This document provides an outline for a tutorial on graph signal processing (GSP) for machine learning. It will include a brief introduction to GSP, key GSP tools for machine learning, and how GSP can help address challenges related to exploiting data structure, improving efficiency/robustness, and enhancing model interpretability. Applications of GSP for machine learning will also be discussed. The tutorial will conclude with a summary of open challenges and new perspectives in the field.
The document discusses neural networks based on competition. It describes three fixed-weight competitive neural networks: Maxnet, Mexican Hat, and Hamming Net. Maxnet uses winner-take-all competition where only the neuron with the largest activation remains active. The Mexican Hat network enhances the activation of neurons receiving a stronger external signal by applying positive weights to nearby neurons and negative weights to those further away. An example demonstrates how the Mexican Hat network increases contrast over iterations.
This document discusses using the Kalman filter for object tracking. It begins by introducing the Kalman filter as a linear discrete-time system and describes its process and measurement equations. It then discusses using the Kalman filter to optimally estimate parameters and extend it to model non-linear systems using a Taylor series approximation. The document describes using the basic and extended Kalman filters for object tracking by initializing the object position and iteratively predicting and correcting its state. It also discusses combining the Kalman filter with mean shift for object tracking and using an adaptive Kalman filter to handle occlusions.
ELM: Extreme Learning Machine: Learning without iterative tuning (zukun)
The document outlines a presentation on extreme learning machines (ELM). It includes four sections that cover: 1) feedforward neural networks and single-hidden layer feedforward networks (SLFNs), 2) ELM methodology including generalized SLFNs and learning without iterative tuning, 3) comparisons between ELM and conventional support vector machines (SVMs), and 4) online sequential ELM. The outline provides sub-topics to be discussed within each section.
Graph Signal Processing for Machine Learning A Review and New Perspectives - ... (lauratoni4)
This document provides an overview of graph signal processing for machine learning. It discusses how networks can be represented as graphs and how graph-structured data is ubiquitous, appearing in applications involving networks of regions, road junctions, individuals, brain regions, and more. It also outlines challenges for using graph signal processing to exploit data structure, improve efficiency and robustness, and enhance model interpretability. The document concludes with a discussion of applications and open challenges.
HML: Historical View and Trends of Deep Learning (Yan Xu)
The document provides a historical view and trends of deep learning. It discusses that deep learning models have evolved in several waves since the 1940s, with key developments including the backpropagation algorithm in 1986 and deep belief networks with pretraining in 2006. Current trends include growing datasets, increasing numbers of neurons and connections per neuron, and higher accuracy on tasks involving vision, NLP and games. Research trends focus on generative models, domain alignment, meta-learning, using graphs as inputs, and program induction.
Hetero-associative memory is a single-layer neural network. However, in this network the input training vectors and the output target vectors are not the same. The weights are determined so that the network stores a set of patterns. A hetero-associative network is static in nature; hence, there are no non-linear or delay operations.
Slides explaining the distinction between bagging and boosting through the bias-variance trade-off, followed by some lesser-known aspects of supervised learning: the effect of the tree-split metric on feature importance, the effect of the decision threshold on classification accuracy, and how to adjust the model threshold for classification in supervised learning.
Note: the limitations of the accuracy metric (baseline accuracy), alternative metrics, their use cases, and their advantages and limitations are also briefly discussed.
Radial basis function network ppt by Sheetal, Samreen and Dhanashri (sheetal katkar)
Radial basis functions are nonlinear activation functions used by artificial neural networks. The slides explain commonly used RBFs, Cover's theorem, the interpolation problem, and learning strategies.
Extreme learning machine: Theory and applications (James Chou)
The document presents the extreme learning machine (ELM) theory and algorithm. ELM is a learning algorithm for single-hidden layer feedforward neural networks. Unlike traditional algorithms, ELM assigns input weights and hidden biases randomly and computes output weights through Moore-Penrose generalized inverse, making it faster than backpropagation. The paper evaluates ELM on regression and classification tasks, finding it achieves comparable or better performance than other algorithms while requiring less training time.
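A minimal NumPy sketch of the ELM recipe summarized above, assuming a tanh hidden layer and least-squares targets T (both illustrative choices): the input weights and biases are drawn randomly and never tuned, and the output weights come from the Moore-Penrose pseudoinverse.

```python
import numpy as np

def elm_train(X, T, n_hidden=100, seed=0):
    """Fit a single-hidden-layer ELM: random hidden layer, analytic output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights (never tuned)
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                     # Moore-Penrose solution for output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```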
This document discusses Bayesian global optimization as a method for tuning machine learning models. It begins by outlining challenges with traditional tuning methods like grid search and random search. It then introduces Bayesian global optimization, which uses a Gaussian process model and expected improvement criterion to efficiently search the parameter space. The document provides examples of applying Bayesian optimization to deep learning tasks in MXNet and TensorFlow to achieve faster and better performance than traditional methods. It concludes by discussing tools for evaluating optimization strategies and comparing Bayesian optimization to baseline methods.
The Kalman filter, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, to estimate the state of a system. Copy the link below and paste it into a new browser window for more information on the Kalman filter: http://www.transtutors.com/homework-help/statistics/kalman-filter.aspx
The document discusses Kalman filters and their applications in tracking and data prediction. It provides an overview of the basic Kalman filter, which works optimally for linear models. It then describes the extended Kalman filter (EKF) which uses Taylor series linearization to apply the Kalman filter to nonlinear systems. Finally, it introduces the unscented Kalman filter (UKF) which uses the unscented transform for better linearization compared to the EKF when nonlinearities are large.
Gradient-Based Learning Applied to Document Recognition. Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, Proceedings of the IEEE, 86(11):2278–2324, November 1998.
Chapter 5, Data Mining: Concepts and Techniques, 2nd Ed. slides, Han & Kamber (error007)
The document discusses Chapter 5 from the book "Data Mining: Concepts and Techniques" which covers frequent pattern mining, association rule mining, and correlation analysis. It provides an overview of basic concepts such as frequent patterns and association rules. It also describes efficient algorithms for mining frequent itemsets such as Apriori and FP-growth, and discusses challenges and improvements to frequent pattern mining.
This document provides an overview of digital image processing and human vision. It discusses the key stages of digital image processing including image acquisition, enhancement, restoration, morphological processing, segmentation, representation and description, object recognition, and compression. It also covers the anatomy of the human eye, photoreceptors, color perception, image formation in the eye, brightness adaptation, and the Weber ratio relating the just noticeable difference in light intensity to background intensity. The document uses images and diagrams from the textbook "Digital Image Processing" to illustrate concepts in digital images and the human visual system.
1. The document discusses the key elements of digital image processing including image acquisition, enhancement, restoration, segmentation, representation and description, recognition, and knowledge bases.
2. It also covers fundamentals of human visual perception such as the anatomy of the eye, image formation, brightness adaptation, color fundamentals, and color models like RGB and HSI.
3. The principles of video cameras are explained including the construction and working of the vidicon camera tube.
A* and Min-Max Searching Algorithms in AI, DSA.pdf (CS With Logic)
A* and min-max searching algorithms in AI. Search algorithms are designed to search for or retrieve elements from the data structure in which they are stored. A* is a searching algorithm used to find the shortest path between an initial and a final point. The mini-max algorithm is a recursive or backtracking algorithm used in decision-making and game theory.
The document provides an overview of artificial neural networks and supervised learning techniques. It discusses the biological inspiration for neural networks from neurons in the brain. Single-layer perceptrons and multilayer backpropagation networks are described for classification tasks. Methods to accelerate learning such as momentum and adaptive learning rates are also summarized. Finally, it briefly introduces recurrent neural networks like the Hopfield network for associative memory applications.
Visual odometry presentation material. The presentation covers two papers: "Omnidirectional visual odometry for a planetary rover" by Peter Corke and "Visual odometry for ground vehicle applications" by David Nistér.
* A brief review of papers on EEG systems that apply deep learning.
* An introduction to four representative EEG paradigms.
* Presented at the A-GIST artificial intelligence study group of the Gwangju Institute of Science and Technology.
* Presentation video (Korean, YouTube): https://youtu.be/gcP2vN41HZw
This document discusses subprograms (also called subroutines) in programming languages. It covers:
- The basic definitions and characteristics of subprograms, including headers, parameters, and local variables.
- Different parameter passing methods like pass-by-value, pass-by-reference, and their implementation in common languages.
- Additional features of subprograms including overloading, generics, and functions passing subprograms as parameters.
The document discusses the extended Kalman filter (EKF), which extends the standard Kalman filter to nonlinear systems through linearization. The EKF linearizes the system equations at each time step by taking the derivative of the nonlinear functions around the current state estimate. This results in an approximate linear system that can then be processed using the standard Kalman filter equations. The key steps of the EKF algorithm are to 1) compute the linearized system matrices using derivatives, 2) use these in a first-order Taylor approximation to linearize the system equations, and 3) apply the standard Kalman filter equations to this approximate linear system to recursively estimate the state.
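To make those three steps concrete, here is a generic NumPy sketch of one EKF predict/correct cycle; f, h and their Jacobians are placeholders for an application's motion and measurement models, not code from the document above.

```python
import numpy as np

def ekf_step(x, P, z, f, F_jac, h, H_jac, Q, R):
    """One predict/correct cycle of the extended Kalman filter."""
    # 1) Linearize the process model around the current estimate.
    F = F_jac(x)
    # 2) First-order (Taylor) prediction of state and covariance.
    x_pred = f(x)
    P_pred = F @ P @ F.T + Q
    # 3) Standard Kalman correction using the linearized measurement model.
    H = H_jac(x_pred)
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - h(x_pred))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```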
Faster R-CNN improves object detection by introducing a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. The RPN slides over feature maps and predicts object bounds and objectness at each position. During training, anchors are assigned positive or negative labels based on Intersection over Union with ground truth boxes. Faster R-CNN runs the RPN in parallel with Fast R-CNN for detection, end-to-end in a single network and stage. This achieves state-of-the-art object detection speed and accuracy while eliminating computationally expensive selective search for proposals.
NS-CUK Seminar: S.T. Nguyen, Review on "Do We Really Need Complicated Model Ar..." (ssuser4b1f48)
Nguyen Thanh Sang presented a paper on developing a simpler model called GraphMixer for temporal link prediction. GraphMixer achieves better performance than more complex baseline models like RNNs and self-attention networks. It uses a fixed time encoding function and MLP mixer to summarize link information, avoiding complex neural architectures. Experiments on five datasets show GraphMixer converges faster and generalizes better. The success of the simpler GraphMixer model suggests complex neural architectures and data processing may not always be needed for temporal network tasks.
Prof. Alba shared parallel biological sequence alignment with the Smith-Waterman algorithm and presented CUDAlign, our fine-grained multi-GPU strategy. This work is part of a research project at the University of Brasília.
The document summarizes a Kaggle competition to forecast web traffic for Wikipedia articles. It discusses the goal of forecasting traffic for 145,000 articles, the evaluation metric used, an overview of the winner's solution using recurrent neural networks, and lessons learned. Key points include that the winner used a sequence-to-sequence model with GRU units to capture local and global patterns in the time series data, and employed techniques like model averaging to reduce variance.
This document discusses machine learning techniques for actuarial science, including supervised learning methods like linear regression, generalized linear models (GLMs), generalized additive models (GAMs), elastic net, classification and regression trees (CART), random forests, boosted models, and stacked ensembles. It also briefly mentions deep learning techniques like multi-layer perceptrons, convolutional neural networks, and recurrent neural networks, as well as natural language processing applications like word2vec. Key advantages and disadvantages of each method are summarized.
Spark is rapidly catching fire with the machine learning and data science community for a number of reasons. Predominantly, it is making it possible to extend and enhance machine learning algorithms to a level we’ve never seen before. In this talk, we’ll give examples of two areas Alpine Data Labs has contributed to the Spark project:
Bio:
DB Tsai is a Machine Learning Engineer working at Alpine Data Labs. His current focus is on Big Data, Data Mining, and Machine Learning. He uses Hadoop, Spark, Mahout, and several Machine Learning algorithms to build powerful, scalable, and robust cloud-driven applications. His favorite programming languages are Java, Scala, and Python. DB is a Ph.D. candidate in Applied Physics at Stanford University (currently taking leave of absence). He holds a Master’s degree in Electrical Engineering from Stanford University, as well as a Master's degree in Physics from National Taiwan University.
https://github.com/telecombcn-dl/dlmm-2017-dcu
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
Map-Reduce for Machine Learning on Multicore (illidan2004)
This document proposes a map-reduce framework for parallelizing machine learning algorithms on multicore processors. The key ideas are:
1) Many machine learning algorithms can be expressed in "summation form" by computing sufficient statistics through summing over data points, allowing data to be partitioned across cores.
2) A map-reduce architecture is developed where data is split among "mappers", which compute partial sums in parallel, and a "reducer" aggregates the results.
3) Ten popular algorithms including linear regression, k-means, logistic regression, and neural networks are shown to fit this framework and achieve near-linear speedup with additional cores.
The document discusses several machine learning projects at NECST Research. It summarizes projects involving behavior identification in animals using models like XGBoost, muscle synergy identification using NMF and neural networks on FPGA, deep learning acceleration on embedded devices using HLS, spiking neural networks for robot simulation, CNN acceleration on FPGA using CONDOR, and the PRETZEL system for optimizing multiple similar ML models deployed on cloud platforms.
Machine learning in science and industry — day 4 (arogozhnikov)
- tabular data approach to machine learning and when it didn't work
- convolutional neural networks and their application
- deep learning: history and today
- generative adversarial networks
- finding optimal hyperparameters
- joint embeddings
This document provides instructions for three exercises using artificial neural networks (ANNs) in Matlab: function fitting, pattern recognition, and clustering. It begins with background on ANNs including their structure, learning rules, training process, and common architectures. The exercises then guide using ANNs in Matlab for regression to predict house prices from data, classification of tumors as benign or malignant, and clustering of data. Instructions include loading data, creating and training networks, and evaluating results using both the GUI and command line. Improving results through retraining or adding neurons is also discussed.
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for De... (Databricks)
This document summarizes an approach for joint optimization of AutoML and transfer learning. It discusses challenges with using AutoML for transfer learning due to limitations on the search space from pretrained models and inability to reuse models across datasets. The proposed approach uses AutoML to search for neural network architectures and hyperparameters based on pretrained models. It then fine-tunes the selected models on target datasets, achieving better accuracy and stability than traditional fine-tuning or standalone AutoML. Experimental results on image classification tasks demonstrate the advantages of the joint optimization approach.
On Implementation of Neuron Network (Back-propagation) (Yu Liu)
This document outlines Yu Liu's work implementing and comparing different parallel versions of a neural network using backpropagation. It discusses motivations for parallel programming practice and library study. It provides an introduction to neural networks and backpropagation algorithms. Three implementations are compared: sequential C++ STL, Skelton library, and Intel TBB. Benchmark results show improved speedups from parallel versions. Remaining challenges are also noted, like addressing local minima problems and testing on larger data.
1. The document presents a hybrid algorithm that combines Kernelized Fuzzy C-Means (KFCM), Hybrid Ant Colony Optimization (HACO), and Fuzzy Adaptive Particle Swarm Optimization (FAPSO) to improve clustering of electrocardiogram (ECG) beat data.
2. The algorithm maps data into a higher dimensional space using kernel functions to make clusters more linearly separable, addresses issues with KFCM being sensitive to initialization and prone to local minima.
3. It uses HACO to optimize cluster centers and membership degrees, and FAPSO to evaluate fitness values and optimize weight vectors, forming usable clusters for applications like ECG classification.
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for D... (Databricks)
This document summarizes research on hyper-parameter selection and adaptive model tuning for deep neural networks. It discusses various techniques for hyper-parameter selection like Bayesian optimization and reinforcement learning. It also describes implementing adaptive model tuning in production by monitoring models and advising on hyper-parameter changes in real-time. Joint optimization of autoML and fine-tuning is presented as an effective method. Interactive interfaces for visualizing training and tuning models are discussed.
Inside LoLA - Experiences from building a state space tool for place transiti... (Universität Rostock)
LoLA is a state space tool for analyzing place/transition nets that was developed starting in 1998. It uses various reduction techniques like stubborn sets, symmetries, and linear algebra to combat state space explosion. LoLA has been applied to problems in areas like model checking, business process verification, and distributed systems. Its core data structures and algorithms keep processing costs low during operations like firing transitions and state space traversal.
A Scaleable Implementation of Deep Learning on Spark - Alexander Ulanov (Spark Summit)
This document summarizes research on implementing deep learning models using Spark. It describes:
1) Implementing a multilayer perceptron (MLP) model for digit recognition in Spark using batch processing and optimizing with native BLAS libraries.
2) Analyzing the tradeoff between computation and communication in parallelizing the gradient calculation for batch training across workers.
3) Benchmark results showing Spark MLP achieves similar performance to Caffe on CPU but scales better by utilizing multiple nodes, getting close to Caffe performance on GPU.
4) Ongoing work to incorporate more deep learning techniques like autoencoders and convolutional neural networks into Spark.
A Scaleable Implementation of Deep Learning on Spark - Alexander Ulanov (Spark Summit)
This document summarizes research on implementing deep learning models using Spark. It describes:
1) Implementing a multilayer perceptron (MLP) model for digit recognition in Spark using batch processing and matrix optimizations to improve efficiency.
2) Analyzing the tradeoffs of computation and communication in parallelizing the gradient calculation for batch training across multiple nodes to find the optimal number of workers.
3) Benchmark results showing Spark MLP achieves similar performance to Caffe on a single node and outperforms it by scaling nearly linearly when using multiple nodes.
(Im2col) Accelerating deep neural networks on low power heterogeneous architec... (Bomm Kim)
This document discusses accelerating deep neural networks on low power heterogeneous architectures. Specifically, it focuses on accelerating the inference time of the VGG-16 neural network on the ODROID-XU4 board, which contains an ARM CPU and Mali GPU. The authors develop parallel versions of VGG-16 using OpenMP for the CPU and OpenCL for the GPU. Several optimizations are explored in OpenCL, including work groups, vector data types, and the CLBlast library. The best OpenCL implementation achieves a 9.4x speedup over the original serial version.
Similar to Graph Signal Processing for Machine Learning A Review and New Perspectives - Part 2
CHINA'S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT (jpsjournal1)
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon reserves and the ancient Silk trade route, along with China's diplomatic endeavours in the area, has been referred to as the "New Great Game." This research centres on that power struggle, considering geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil politics, and traditional and non-traditional security are explored and explained. Using Mackinder's Heartland, Spykman's Rimland, and hegemonic stability theories, the study examines China's role in Central Asia. It adheres to an empirical epistemological method, takes care to remain objective, and critically analyses primary and secondary research documents to elaborate the role of China's geo-economic outreach in Central Asian countries and its future prospects. According to this study, China is seeing significant success in trade, pipeline politics, and gaining influence over other governments, thanks to the effective use of key instruments such as the Shanghai Cooperation Organisation and the Belt and Road Economic Initiative.
Introduction: e-waste definition; sources of e-waste; hazardous substances in e-waste; effects of e-waste on the environment and human health; the need for e-waste management; e-waste handling rules; waste minimization techniques for managing e-waste; recycling of e-waste; disposal and treatment methods of e-waste; mechanisms for extracting precious metals from leaching solutions; the global scenario of e-waste; e-waste in India; case studies.
ACEP Magazine, 4th edition, launched on 05.06.2024 (Rahul)
This document provides information about the third edition of the magazine "Sthapatya" published by the Association of Civil Engineers (Practicing) Aurangabad. It includes messages from current and past presidents of ACEP, memories and photos from past ACEP events, information on life time achievement awards given by ACEP, and a technical article on concrete maintenance, repairs and strengthening. The document highlights activities of ACEP and provides a technical educational article for members.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA) pavement, however, RCA pavement has been the subject of fewer comprehensive studies and sustainability assessments.
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines (Christina Lin)
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Advanced control scheme of doubly fed induction generator for wind turbine us... (IJECEIAES)
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Literature Review Basics and Understanding Reference Management.pptx (Dr Ramhari Poudyal)
A three-day training on academic research, focusing on analytical tools, at United Technical College, supported by the University Grants Commission, Nepal, 24–26 May 2024.
2. Operations Strategy in a Global Environment.ppt
Graph Signal Processing for Machine Learning A Review and New Perspectives - Part 2
1. Graph Signal Processing for Machine Learning: A Review and New Perspectives
ICASSP Tutorial, June 2021
Xiaowen Dong, Dorina Thanou, Laura Toni, Michael Bronstein, Pascal Frossard
3. Outline
• Brief introduction to graph signal processing (GSP)
• Key GSP tools for machine learning
• Challenge I: GSP for exploiting data structure
• Challenge II: GSP for improving efficiency and robustness
• Challenge III: GSP for enhancing model interpretability
• Applications
• Summary, open challenges, and new perspectives
6. GSP for improving efficiency and robustness
A. Improvement on data efficiency and robustness
B. Robustness against topological noise
C. Improvement on computational efficiency
9. A. Improvement on data efficiency and robustness
• In many classical ML applications, data is scarce or noisy
• Need for prior knowledge to improve performance:
- Graph-based regularization could help in that direction
13. A.1. Graph signal smoothness as loss function
• Cross-entropy loss is widely used in (semi-)supervised learning frameworks. However:
- The underlying manifold structure is not preserved: inputs of the same class tend to be mapped to the same output
- One-hot-bit encoding is independent of the input distribution and network initialization: slow training process
• An alternative GSP-inspired loss function:
- Maximizes the distance between images of distinct classes
- Minimizes a loss based on the smoothness of the label signals on a similarity graph

Smoothness of the label signal for class $c$:

$S_G(s_c) = s_c^T L s_c = \sum_{u,v \in V} W_{v,u} \, (s_c(v) - s_c(u))^2$

$L_G(f) = \sum_{c=1}^{C} S_G(s_c) = \sum_{s_c(v) s_c(u) = 0, \, \forall c} \exp(-\alpha \| f(x_v) - f(x_u) \|)$
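A small NumPy sketch of this loss, assuming Gaussian-style affinities exp(-α‖f(x_u) - f(x_v)‖) between network outputs (the kernel and α are illustrative choices): each class indicator signal s_c is scored by its quadratic form with the graph Laplacian.

```python
import numpy as np

def smoothness_loss(F, labels, n_classes, alpha=1.0):
    """Sum over classes of the label-signal smoothness s_c^T L s_c on a similarity graph.

    F: (n, d) network outputs; labels: (n,) integer class labels.
    """
    D = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=-1)
    W = np.exp(-alpha * D)                   # affinities between network outputs
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W           # combinatorial graph Laplacian
    loss = 0.0
    for c in range(n_classes):
        s_c = (labels == c).astype(float)    # indicator (label) signal of class c
        loss += s_c @ L @ s_c                # proportional to sum_{u,v} W_uv (s_c(u)-s_c(v))^2
    return loss
```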
14. Experimental validation
• Clustered-like embeddings for CIFAR-10
• Robustness to cluster deviations

Bontonou et al., "Introducing graph smoothness loss for training deep learning architectures," IEEE Data Science Workshop, 2019
15. A.2. Latent geometry graphs
• Capture the latent geometry of intermediate layers of deep networks using graphs
• Quantify the expressive power of each layer by measuring the label variation
• Impose desirable properties into these representations by applying GSP constraints
- Robustness: enforce smooth label variations between consecutive layers

Lassance et al., "Laplacian networks: bounding indicator function smoothness for neural networks robustness," APSIPA Trans. on Sign. and Infor. Process., 2021
Lassance et al., "Representing Deep Neural Networks Latent Space Geometries with Graphs," Algorithms, 2021
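A sketch of the idea, assuming a simple unweighted k-NN graph over a layer's activations (k and the graph construction are illustrative): the smoothness of the one-hot label signals on each layer's graph quantifies how well that layer separates the classes.

```python
import numpy as np

def label_variation(H, labels, k=10):
    """Smoothness of one-hot label signals on a k-NN graph of layer activations H (n, d).

    labels: (n,) integer class labels.
    """
    D = np.linalg.norm(H[:, None, :] - H[None, :, :], axis=-1)
    W = np.zeros_like(D)
    for i in range(len(H)):
        W[i, np.argsort(D[i])[1:k + 1]] = 1.0   # connect each point to its k nearest neighbours
    W = np.maximum(W, W.T)                       # symmetrize
    L = np.diag(W.sum(axis=1)) - W
    S = np.eye(labels.max() + 1)[labels]         # one column of indicators per class
    return np.trace(S.T @ L @ S)                 # total label variation on this layer's graph
```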
17. A.3. Graph regularization for dealing with noisy labels
• Main challenge: binary classification with noisy labels
• Proposed approach (DynGLR): a two-step learning process
- Graph learning: extract deep feature maps and learn a graph that maximizes/minimizes the similarity between two nodes that have the same/opposite labels (G-Net)
- Classifier learning: alternate between refining the graph (H-Net) and restoring the corrupted classifier signal (W-Net) through graph Laplacian regularization (GLR)

$Y^r = \arg\min_{B} \| Y^{r-1} - B \|_2^2 + \mu^r B^T L^r B \quad \text{(GLR)}$
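The GLR step above has a closed form: setting the gradient of ‖Y^{r-1} - B‖² + μ BᵀLB to zero gives B = (I + μL)⁻¹ Y^{r-1}. A minimal sketch, with the graph weights W assumed given:

```python
import numpy as np

def glr_restore(Y_prev, W, mu=1.0):
    """Graph Laplacian regularization: argmin_B ||Y_prev - B||^2 + mu * B^T L B."""
    L = np.diag(W.sum(axis=1)) - W                           # combinatorial Laplacian
    return np.linalg.solve(np.eye(len(Y_prev)) + mu * L, Y_prev)
```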
18. Experimental validation
• Classification error curves for different levels of noisy labels
- Phoneme: contains nasal and oral sounds
- CIFAR-10: subselection of two classes, airplane and ship
• Introducing GLR is beneficial especially in the high-noise regime

Ye et al., "Robust Deep Graph Based Learning for Binary Classification," IEEE Trans. on Sign. and Inform. Process. over Net., 2020
19. A.4. Graph regularization for zero-shot learning
• Zero-shot learning: exploit models trained with supervision, and some mapping to a semantic space, to recognise objects as belonging to classes with unseen examples during training
• Key assumption: regularize the space of the continuous manifolds and the map between them by minimising the isoperimetric loss (IPL)
• Visual embeddings $Z$, semantic embeddings $S$
21. Graph-based approximation of IPL
• Isoperimetric loss (IPL): measures the flow through a closed neighborhood relative to the area of its boundary
• A graph-based approximation of IPL:
- Treat visual embeddings as vertices in a graph $G = (Z, W)$, with affinities $W$ as edges, and the visual-to-semantic embedding map $\phi: G \rightarrow S$ as a linear function on the graph
- Seek a non-parametric deformation, represented by changes in the connectivity matrix, that minimises the IPL
- Approximate the IPL by using spectral reduction: spectral graph wavelets of the semantic embedding space, $\hat{S} = g(L) S$
- Construct a new graph $G' = (\hat{S}, W')$ based on the spectral embeddings
- Perform clustering to map clusters to labels
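A sketch of the spectral reduction step Ŝ = g(L)S via the Laplacian eigendecomposition; the heat-like kernel below is an illustrative stand-in for the spectral graph wavelet generator used in the paper.

```python
import numpy as np

def spectral_filter(S, W, g):
    """Apply a spectral filter g(L) to the columns of S: S_hat = U g(Lambda) U^T S."""
    L = np.diag(W.sum(axis=1)) - W
    lam, U = np.linalg.eigh(L)                     # symmetric Laplacian: real spectrum
    return U @ (g(lam)[:, None] * (U.T @ S))

# Illustrative use with toy data and a heat-like low-pass kernel.
rng = np.random.default_rng(0)
W = rng.random((20, 20)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
S = rng.standard_normal((20, 5))                   # stand-in semantic embeddings
S_hat = spectral_filter(S, W, lambda lam: np.exp(-2.0 * lam))
```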
22. Experimental validation
• Embeddings in AWA1
• IPL regularization generates more compact embeddings
• Error decrease of 9.8% and 44% for the AWA1 and CUB datasets

(Figure: 2D t-SNE embeddings, without IPL vs. with IPL)

Deutsch et al., "Zero shot learning with the isoperimetric loss," AAAI, 2020
23. A.5. GSP for multi-task learning
• Multi-task learning: learn simultaneously several related tasks
- Helps improve generalisation performance
• Often data are collected in a distributed fashion
- Each node can communicate only with local neighbors to solve a task

(Figure: a) single-task learning; b) multi-task learning)

Nassif et al., "Multi-task learning over graphs," IEEE Signal Process. Mag., 2020
24. Spectral regularization for multi-task learning
• The graph captures the correlation between the tasks, i.e., nodes of the graph
• The goal of each node $k$ is to compute the parameters that minimise an objective function:

$w_k^0 = \arg\min_{w_k} J_k(w_k)$

• The relationship between the tasks can be exploited by imposing a regularization of the cost function on the task graph:

$W^\star = \arg\min_W J^{\mathrm{glob}}(W) = \sum_{k=1}^{N} J_k(w_k) + \frac{\eta}{2} R(W, G)$

• Regularization examples:
- Smoothness of tasks on the graph: $R(W, G) = W^T L W$
- Graph spectral regularization: $R(W, G) = W^T g(L) W$
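A sketch of the regularized objective, with W stacking one task vector per row and a least-squares per-task loss as an illustrative choice of J_k; with the combinatorial Laplacian, tr(Wᵀ L W) equals ½ Σ_{k,l} A_{kl} ‖w_k - w_l‖², i.e. the smoothness of the tasks on the task graph.

```python
import numpy as np

def multitask_objective(W_stack, Xs, ys, L, eta=1.0):
    """Sum of per-task least-squares losses plus the task-graph penalty tr(W^T L W).

    W_stack: (N, M) matrix whose k-th row is task k's parameter vector w_k.
    Xs, ys: per-task data matrices and targets; L: (N, N) task-graph Laplacian.
    """
    data_term = sum(np.sum((Xs[k] @ W_stack[k] - ys[k]) ** 2)
                    for k in range(len(Xs)))            # J_k for each task k
    reg = np.trace(W_stack.T @ L @ W_stack)             # task smoothness on the graph
    return data_term + 0.5 * eta * reg
```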
25. Summary so far
• Graphs can be used to capture hidden dependencies of any type of data
• The domain of the graph is application specific
- Latent space of features
- Manifold approximation
- Distributed agents
• GSP tools, and in particular graph regularization, help towards imposing desirable properties
- Better node embeddings
- Cleaner signals
- More robust models
26. GSP for improving efficiency and robustness
A. Improvement on data efficiency and robustness
B. Robustness against topological noise
C. Improvement on computational efficiency
29. B. Robustness against topological noise
• In the context of graph-structured data, stability can be defined with respect to perturbations of the underlying topology
Why is it important?
• Noisy/unreliable graphs
• Adversaries
• Transferability
• Changes through time
30. B.1. Stability of graph filters
Levie et al., “On the transferability of spectral graph filters,” SampTA, 2019.
32. B.2. Stability of graph neural networks
Gama et al., “Stability properties of graph neural networks,” IEEE TSP, 2020.
Ruiz et al., “Graph neural networks: Architectures, stability, and transferability,” Proceedings of the IEEE, 2021.
34. Stability of graph neural networks
• Perturbations that do not modify the degree distribution of the graph
• Stability under double-edge rewiring
Kenlay et al., “On the stability of graph convolutional neural networks under edge rewiring,” ICASSP, 2021.
36. B.3. Interpretable stability bounds
• How does stability depend on topological properties of the perturbation?
• Spectral graph filters are most stable if the perturbation
- adds or deletes edges between high-degree nodes
- does not perturb too much around any one node
[Figure: stability under PGD and robust perturbations (edge addition and deletion), on a BA graph and a 3-regular graph]
Kenlay et al., “Interpretable stability bounds for spectral graph filters,” ICML, 2021.
37. B.4. Robustness beyond stability
• GSP in the presence of topological uncertainty (more in part III) [2, 4]
• Filters on stochastic time-evolving graphs [1]
• Stochastic graph filters built on sequences of randomly perturbed graphs [3]
[1] Isufi et al., “Filtering random graph processes over random time-varying graphs,” IEEE TSP, 2017.
[2] Ceci and Barbarossa, “Graph signal processing in the presence of topology uncertainties,” IEEE TSP, 2020.
[3] Gao et al., “Stochastic graph neural networks,” ICASSP, 2020.
[4] Miettinen et al., “Modelling graph errors: Towards robust graph signal processing,” arXiv, 2020.
38. GSP for improving efficiency and robustness
A. Improvement on data efficiency and robustness
B. Robustness against topological noise
C. Improvement on computational efficiency
41. C. Improvement on computational complexity
• Many ML algorithms require a large amount of resources (e.g., space, runtime, communication) for their computation
- Eigendecomposition of a large matrix is expensive
- Training and deploying GNNs remains challenging due to high memory consumption and inference latency
Can we use the graph structure and classical GSP operators to alleviate some of these issues?
[Image: social network graph, from https://en.wikipedia.org/wiki/Social_network_analysis]
42. C.1. Complexity of spectral clustering
• Spectral clustering: Given a graph $\mathcal{G}$ capturing the structure of $N$ data points, find the partition of the nodes into $K$ clusters
• Algorithm: Given the graph Laplacian matrix $L$
- Compute the first $K$ eigenvectors $U_K = [u_1, u_2, \ldots, u_K]$
- Compute the embedding of each node $n$ in $U_K$: $\phi_n = U_K^T \delta_n$
- Run K-means with the Euclidean distance $D_{n,m} = \|\phi_n - \phi_m\|_2$ on the embeddings to compute the clusters
• Bottlenecks: When $N, K$ are large
- Computing the eigendecomposition is expensive: $O(NK^2)$
- The complexity of K-means is high: $O(NK^2)$
[Figure: data points → graph structure → detected clusters]
von Luxburg, “A tutorial on spectral clustering”, Statistics and Computing, 2007
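To make the two bottlenecks concrete, here is a minimal sketch of this baseline algorithm (assuming numpy and scikit-learn's KMeans; the function name is ours, not from the tutorial):

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(L, K):
    """L: (N, N) graph Laplacian; K: number of clusters."""
    # Bottleneck 1: eigendecomposition of L
    _, U = np.linalg.eigh(L)
    U_K = U[:, :K]  # eigenvectors of the K smallest eigenvalues
    # Each row of U_K is a node embedding phi_n = U_K^T delta_n
    # Bottleneck 2: K-means over all N embeddings
    return KMeans(n_clusters=K, n_init=10).fit_predict(U_K)
```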
43. Compressive spectral clustering
• A GSP perspective:
- The computation of eigenvectors is bypassed by filtering random graph signals
- K-means is performed on a subset of randomly selected nodes
- The cluster centers of the whole graph are interpolated
• Algorithm:
- Estimate $\lambda_K$ from $L$
- Generate $d$ random graph signals in a matrix $R \in \mathbb{R}^{N \times d}$
- Filter them with a polynomial filter $H_{\lambda_K}$, i.e., $\tilde{\phi}_n = (H_{\lambda_K} R)^T \delta_n$
- If $d \sim \log N$: $\tilde{D}_{n,m} = \|\tilde{\phi}_n - \tilde{\phi}_m\|_2 \approx D_{n,m}$
- Draw $m \approx K \log K$ randomly sampled nodes, and run K-means to obtain cluster centers $c^r = [c_1^r, \ldots, c_K^r]$
- Interpolate the centers by exploiting the sampling theory of bandlimited graph signals: $\tilde{c}_j = \arg\min_{c_j \in \mathbb{R}^N} \|M c_j - c_j^r\|_2^2 + \gamma\, c_j^T g(L) c_j$
Tremblay et al., “Compressive spectral clustering”, ICML, 2016
Tremblay et al., “Approximating spectral clustering by sampling: A review”, Sampling Techniques for Supervised or Unsupervised Tasks, Springer, 2020
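The key step is applying $H_{\lambda_K}$ without any eigendecomposition. A minimal sketch of that filtering step follows, assuming $\lambda_K$ and $\lambda_{\max}$ are already estimated and using a plain Chebyshev expansion of the ideal low-pass response (Tremblay et al. use Jackson-Chebyshev polynomials and an eigencount procedure; this simplified version is only illustrative):

```python
import numpy as np

def chebyshev_lowpass(L, lam_K, lam_max, R, order=50):
    """Approximate H_{lam_K} R for the ideal low-pass h(lam) = 1{lam <= lam_K},
    via the recursion T_{j+1}(x) = 2 x T_j(x) - T_{j-1}(x)."""
    N = L.shape[0]
    # Rescale the spectrum of L from [0, lam_max] to [-1, 1]
    L_hat = (2.0 / lam_max) * L - np.eye(N)
    h = lambda x: (lam_max * (x + 1) / 2 <= lam_K).astype(float)
    c = np.polynomial.chebyshev.chebinterpolate(h, order)
    T_prev, T_cur = R, L_hat @ R
    out = c[0] * T_prev + c[1] * T_cur
    for j in range(2, order + 1):
        T_prev, T_cur = T_cur, 2 * (L_hat @ T_cur) - T_prev
        out += c[j] * T_cur
    return out  # rows are the compressed embeddings phi_tilde_n

# Usage: R = np.random.randn(N, d) / np.sqrt(d) with d ~ log N, then run
# K-means on a random subset of the rows of the output and interpolate.
```

Since only matrix-vector products with $L$ are needed, the cost scales with the number of edges rather than with a full eigendecomposition.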
44. C.2. Adaptive filters and per-node weighting in GNNs
Tailor et al., “Adaptive filters and aggregator fusion for efficient graph convolutions,” arXiv, 2021.
45. Adaptive filters and per-node weighting in GNNs
• Operations can be computed with the standard compressed sparse row (CSR) algorithm (SpMM)
• Lower memory consumption (no message materialisation)
• Requires only $O(N)$ memory
[Figure: graph attention network vs. efficient graph convolution]
48. Take home message: GSP for robustness and efficiency
GSP tools …
- Graph signal regularization
- Graph filtering
- Graph interpolation
- Graph-based transforms
- Diffusion operators
… for ML
- Better embeddings
- Higher classification accuracy in noisy settings
- Less complex GNNs
- Faster approximation of the eigendecomposition
- Stability with respect to topological noise
49. Outline
• Brief introduction to graph signal processing (GSP)
• Key GSP tools for machine learning
• Challenge I: GSP for exploiting data structure
• Challenge II: GSP for improving efficiency and robustness
• Challenge III: GSP for enhancing model interpretability
• Applications
• Summary, open challenges, and new perspectives
51. GSP for enhancing interpretability
A. Domain specific: Extract relevant knowledge from data
- A.1. Signal analysis: Use GSP to reveal interpretable features
- A.2. Structure inference: Use GSP to learn interpretable structures
B. Model specific: Improve our understanding of ML models
- B.1. Understanding the expressive power of GNNs
- B.2. A posteriori interpretation of DNNs
53. A.1. Domain-specific knowledge discovery
• In neuroscience, GSP tools have been used to improve our understanding of the biological mechanisms underlying human cognition and brain disorders
• Analysis in the spectral domain reveals the variation of signals on the anatomical network
[Fig. from Huang’18]
54. A.1.1. GSP for understanding cognitive flexibility
• Cognitive flexibility describes the human ability to switch between modes of mental function
• Clarifying the nature of cognitive flexibility is critical to understanding the human mind
• GSP provides a framework for integrating brain network structure, function, and cognitive measures
• It allows each BOLD signal to be decomposed into aligned and liberal components
Medaglia et al., “Functional Alignment with Anatomical Networks is Associated with Cognitive Flexibility”, Nat. Hum. Behav., 2018
55. BOLD signal alignment across the brain
• Functional alignment with anatomical networks facilitates cognitive flexibility (lower switch costs)
- Liberal signals are concentrated in subcortical regions and cingulate cortices
- Aligned signals are concentrated in subcortical, default mode, fronto-parietal, and cingulo-opercular systems
[Figure: BOLD signals and the white matter network are combined to decompose the signals into aligned and liberal components using the GFT]
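A minimal sketch of such a decomposition, under the common convention that "aligned" means the projection onto the lowest graph frequencies of the structural network and "liberal" the remainder (the cutoff K is a free parameter; this is our illustration, not the papers' code):

```python
import numpy as np

def aligned_liberal_split(L_struct, x, K):
    """L_struct: (N, N) Laplacian of the white-matter network;
    x: (N,) BOLD signal over brain regions; K: spectral cutoff index."""
    _, U = np.linalg.eigh(L_struct)   # GFT basis: eigenvectors of L
    x_hat = U.T @ x                   # graph Fourier coefficients
    aligned = U[:, :K] @ x_hat[:K]    # low graph frequencies (aligned)
    liberal = U[:, K:] @ x_hat[K:]    # high graph frequencies (liberal)
    return aligned, liberal           # note: aligned + liberal == x
```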
56. A.1.2. Structural decoupling index
• Ratio of liberal versus aligned energy in a specific node
• The spatial organization of regions according to the decoupling index reveals a behaviourally relevant gradient
Preti and Van De Ville, “Decoupling of brain function from structure reveals regional behavioral specialization in humans”, Nat. Comm., 2019
Structurally coupled regions: sensory regions
Structurally decoupled regions: higher-level cognitive regions
57. A.1.3. Understanding ASD through GSP
• Predict Autism Spectrum Disorder (ASD) by exploiting structural and functional information
• Discriminative patterns are extracted in the graph Fourier domain
• Frequency signatures are defined as the variability over time of graph Fourier modes
Itani and Thanou, “Combining anatomical and functional networks for neuropathology identification: A case study on autism spectrum disorder”, Medical Image Analysis, 2021
58. Interpretable and discriminative features
• Neurotypical patients express predominant activity in the parieto-occipital regions
• ASD patients express a high level of activity in the fronto-temporal areas
[Figure: spatial activity maps for neurotypical patients vs. ASD patients]
60. A.2. Structure inference
• Given observations on a number of variables, and some prior knowledge (distribution, model, constraints), learn a measure of the relations between them
[Figure: variables v1 and v2 carrying a graph signal, with unknown relations]
How to infer interpretable structure from data?
64. GSP for structure inference
$y = \mathcal{D}(\mathcal{G})\, x$: given $y$ and $\mathcal{D}$, infer $x$ and $\mathcal{G}$
Examples of signal graph models:
- Smoothness: $\mathcal{D}(\mathcal{G})$ promotes signals that vary slowly on $\mathcal{G}$
- Diffusion: $\mathcal{D}(\mathcal{G}) = e^{-\tau L}$
Dong et al., “Learning graphs from data: A signal representation perspective,” IEEE SPM, 2019.
Mateos et al., “Connecting the dots: Identifying network structure via graph signal processing,” IEEE SPM, 2019.
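Since the slide only states the inference problem, a minimal sketch of one concrete instance may help: learning a graph from signals assumed smooth on it, in the spirit of the smoothness models surveyed in Dong et al. (a simplified objective with plain projected gradient descent; the function name, objective weights, and step size are ours):

```python
import numpy as np

def learn_graph_from_smooth_signals(X, alpha=1.0, beta=1.0, lr=1e-3, iters=2000):
    """X: (N, d) matrix holding d graph signals on N nodes.
    Returns a symmetric nonnegative weight matrix W with zero diagonal.
    Objective: sum_ij W_ij ||x_i - x_j||^2   (smoothness on the learned graph)
               - alpha * sum_i log(deg_i)    (encourages every node to connect)
               + beta * ||W||_F^2            (controls sparsity/scale)."""
    N = X.shape[0]
    sq = (X ** 2).sum(axis=1)
    Z = sq[:, None] + sq[None, :] - 2 * X @ X.T   # Z_ij = ||x_i - x_j||^2
    W = np.ones((N, N)) - np.eye(N)               # fully connected start
    for _ in range(iters):
        deg = W.sum(axis=1)
        grad = Z - alpha / (deg[:, None] + 1e-12) + 2 * beta * W
        grad = (grad + grad.T) / 2                # keep W symmetric
        W = np.maximum(W - lr * grad, 0.0)        # project onto W >= 0
        np.fill_diagonal(W, 0.0)
    return W
```

Nodes with similar signal values end up strongly connected, because large distances $Z_{ij}$ push the corresponding weights $W_{ij}$ towards zero.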
65. A.2.1. Imposing domain-specific priors
• Leads to more interpretable structures
- Genes are typically clustered into pathways
- A bipartite graph structure is more probable for drug discovery
• Examples of spatial and spectral constraints:
- $K$-component graph: $S = \{\{\lambda_j = 0\}_{j=1}^{k},\ c_1 \le \lambda_{k+1} \le \cdots \le \lambda_p \le c_2\}$
- Connected sparse graph: $S = \{\lambda_1 = 0,\ c_1 \le \lambda_2 \le \cdots \le \lambda_p \le c_2\}$
- $K$-connected $d$-regular graph: $S = \{\{\lambda_j = 0\}_{j=1}^{k},\ c_1 \le \lambda_{k+1} \le \cdots \le \lambda_p \le c_2,\ \mathrm{diag}(L) = d\mathbf{1}\}$
- Cospectral graph: $S = \{\lambda_i = f(\tilde{\lambda}_i),\ \forall i \in [1, p]\}$
Kumar et al., “Structured Graph Learning via Laplacian Spectral Constraints”, NeurIPS, 2019
66. Illustrative example
• Animals dataset
• Imposing component constraints leads to more semantically meaningful graphs
[Figure: graphs learned with graphical LASSO vs. 1-component vs. 5-component constraints]
67. A.2.2. Learning product graphs
• Cartesian product graphs are useful to explain complex relationships in multi-domain graph data
• Learning graph factors with rank constraints
[Figure: exact decomposition vs. approximate decomposition of a product graph]
Kadambari and Chepuri, “Product graph learning from multi-domain data with sparsity and rank constraints,” arXiv, 2020.
68. Multi-view object clustering
• COIL-20 dataset:
- 10 objects → object graph of 10 nodes
- Rotation every 10 degrees, 36 views per object → view graph of 36 nodes
[Figure: object graph and its connected components; view graph and its connected components]
69. GSP for enhancing interpretability
A. Domain specific: Extract relevant knowledge from data
- A.1. Signal analysis: Use GSP to reveal interpretable features
- A.2. Structure inference: Use GSP to learn interpretable structures
B. Model specific: Improve our understanding of ML models
- B.1. Understanding the expressive power of GNNs
- B.2. A posteriori interpretation of DNNs
71. B.1. A GSP perspective on the expressive power of GNNs
• A spectral analysis of GNNs provides a complementary point of view to the classical Weisfeiler-Lehman (WL) test
• One step further in explaining GNNs
• A common framework for spectral and spatial GNNs
• Layer propagation with convolution supports $C^{(s)}$: $H^{(l+1)} = \sigma\big(\textstyle\sum_s C^{(s)} H^{(l)} W^{(l,s)}\big)$
• The frequency profile is defined as: $\Phi_s(\lambda) = \mathrm{diag}^{-1}(U^T C^{(s)} U)$, where $U$ holds the eigenvectors of the Laplacian
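A minimal sketch of how such a frequency profile can be computed numerically, illustrated on the standard GCN support $\hat{D}^{-1/2}(A + I)\hat{D}^{-1/2}$ (the helper names are ours; this follows the definition above, not the authors' code):

```python
import numpy as np

def frequency_profile(C, L):
    """C: (N, N) convolution support; L: (N, N) graph Laplacian.
    Returns the eigenvalues of L and the diagonal of U^T C U."""
    lam, U = np.linalg.eigh(L)       # GFT basis
    return lam, np.diag(U.T @ C @ U) # filter response per graph frequency

def gcn_support(A):
    """Standard GCN support: D_hat^{-1/2} (A + I) D_hat^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
```

Plotting the returned response against the eigenvalues shows, for instance, that the GCN support acts essentially as a low-pass filter.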
72. Frequency support of well-known architectures
$\Phi_s(\lambda) = \mathrm{diag}^{-1}(U^T C^{(s)} U)$
Balcilar et al., “Analyzing the expressive power of graph neural networks in a spectral perspective”, ICLR, 2021
73. Frequency profiles of known GNNs
[Figure: (i) first 5 ChebNet supports; (ii) first 7 CayleyNet supports; (iii) GCN frequency profiles; (iv) GIN on 1D]
74. B.2. A posteriori interpretation of learning architectures
• Hypothesis: Identify the relative change in a model’s prediction
• Model Analysis and Reasoning using Graph-based Interpretability (MARGIN):
- Construct a domain (for interpretability) graph $G = (V, E, A)$
- Define an explanation function $f(G)$ at the nodes of the graph
- Choose the influential nodes by applying a high-pass graph filtering: $I(i) = \|(f - Af)_i\|_2^2,\ \forall i \in V$
- Generate explanations by determining the influential nodes on the graph
Anirudh et al., “MARGIN: Uncovering Deep Neural Networks Using Graph Signal Analysis”, Frontiers in Big Data, 2021
75. Explanations for image classification (I)
• Nodes of the graph: superpixels from images
• Graph edges: relative importance of each superpixel
• Explanation: ratio between the size of the superpixel corresponding to the node and the size of the largest superpixel
77. Interpreting decision boundaries
• Use MARGIN to identify samples that are likely to be misclassified
- Nodes: embeddings of each sample
- Edges: similarity between embeddings
- Explanation function: local label agreement
78. Take home message: GSP for interpretability
• Interpreting the data
- Graph-based transforms reveal new and interpretable features
- Integrating them into a machine learning framework leads to more accurate and interpretable models
- Imposing application-related constraints in topology inference algorithms generates interpretable structures
• Interpreting the models
- Analysing the spectral behaviour of GNNs provides insights on their expressive power
- GSP operators contribute towards the a posteriori interpretation of learning architectures
79. Summary
• GSP has shown promising results towards improving different aspects of classical ML algorithms:
- Robustness to noisy and limited data
- Robustness to topological noise
- Data and model interpretability
• The works presented here are indicative, not exhaustive
• Plenty of room for exciting research!