Ivan Sahumbaiev, "Deep Learning approaches meet 3D data" (Fwdays)
In this talk, I discuss how 3D data can be processed with Deep Learning models. The main focus is on point clouds.
Session agenda:
What 3D data is and how it is represented
Overview of libraries for visualization and processing
How to collect 3D data: cameras and calibration
The current state of the art for point cloud processing with Deep Learning models:
Classification problem: models to use
Segmentation problem: models to use
Datasets, losses, and the training routine
Point cloud correspondences:
Spectral methods to generate correspondences
Limitations
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetup (Luba Elliott)
This talk by Lucas Theis from Twitter/Magic Pony on "Compressing Images with Neural Networks" was presented at the Learning Image Representations event on 30th August at Twitter as part of the Creative AI meetup.
Multiple patterning is a class of technologies for manufacturing integrated circuits (ICs), developed for photolithography to enhance the feature density. The simplest case of multiple patterning is double patterning, where a conventional lithography process is enhanced to produce double the expected number of features. The resolution of a photoresist pattern is believed to blur at around 45 nm half-pitch. For the semiconductor industry, therefore, double patterning was introduced for the 32 nm half-pitch node and below. This presentation gives us insight into why multiple patterning is important for achieving better resolution below 32 nm.
Euro30 2019 - Benchmarking tree approaches on street data (Fabion Kauker)
By examining the use of algorithms to solve the Prize Collecting Steiner Tree (PCST) problem, we consider the facets that determine effectiveness, specifically by measuring a number of solution approaches and comparing them on metrics. In order to understand a solution approach, we must assess why it is useful. Our goal is to determine the effectiveness of Mixed Integer Programming (MIP) and heuristic methods. Utilizing freely available street and address data, a base graph representation is created and then computed on, such that a tree connects every address using the minimum total length of edges from the street network. This is the basis of many approaches used to solve infrastructure problems, including telecommunications network design and costing. The analysis is conducted on methods developed by Hegde et al. 2015, Ljubić et al. 2006, and Teitz et al. 1963. We present a data processing architecture, as well as a concise set of results and a framework for assessing the facets and trade-offs of a given approach. In this case the heuristic approaches prove to have advantages in the simple case but fail when more complex requirements are added. This is where the MIP approach is able to capitalize, although the strictness and specificity of its modelling limit flexibility.
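The core objective above, a tree connecting every address with minimum total street length, reduces in its simplest prize-free form to a minimum spanning tree. The sketch below is only a hedged illustration of that simplified objective using Prim's algorithm (function and variable names are mine, not any of the benchmarked implementations; real PCST additionally trades off node prizes against edge costs):

```python
import heapq

def mst_total_length(n, edges):
    """Prim's algorithm: total length of a tree connecting all n nodes.

    `edges` is a list of (u, v, length) tuples over nodes 0..n-1.
    Assumes a connected graph. This is the simplest relative of the
    PCST objective (connect every address with minimum total edge
    length from the street network).
    """
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((w, v))
        adj[v].append((w, u))
    seen = [False] * n
    heap = [(0.0, 0)]          # (edge length, node); grow from node 0
    total = 0.0
    while heap:
        w, u = heapq.heappop(heap)
        if seen[u]:
            continue           # already attached by a shorter edge
        seen[u] = True
        total += w
        for wv, v in adj[u]:
            if not seen[v]:
                heapq.heappush(heap, (wv, v))
    return total
```

On a square street block with one expensive diagonal-closing edge, the tree keeps the three cheap edges and drops the expensive one.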
We review our recent progress in the development of graph kernels. We discuss the hash graph kernel framework, which makes the computation of kernels for graphs with vertices and edges annotated with real-valued information feasible for large data sets. Moreover, we summarize our general investigation of the benefits of explicit graph feature maps in comparison to using the kernel trick. Our experimental studies on real-world data sets suggest that explicit feature maps often provide sufficient classification accuracy while being computed more efficiently. Finally, we describe how to construct valid kernels from optimal assignments to obtain new expressive graph kernels. These make use of the kernel trick to establish one-to-one correspondences. We conclude with a discussion of our results and their implications for the future development of graph kernels.
Glocalized Weisfeiler-Lehman Graph Kernels: Global-Local Feature Maps of Graphs Christopher Morris
Most state-of-the-art graph kernels only take local graph properties into account, i.e., the kernel is computed with regard to properties of the neighborhood of vertices or other small substructures. On the other hand, kernels that do take global graph properties into account may not scale well to large graph databases. Here we propose to start exploring the space between local and global graph kernels, striking the balance between both worlds. Specifically, we introduce a novel graph kernel based on the k-dimensional Weisfeiler-Lehman algorithm. Unfortunately, the k-dimensional Weisfeiler-Lehman algorithm scales exponentially in k. Consequently, we devise a stochastic version of the kernel with provable approximation guarantees using conditional Rademacher averages. On bounded-degree graphs, it can even be computed in constant time. We support our theoretical results with experiments on several graph classification benchmarks, showing that our kernels often outperform the state-of-the-art in terms of classification accuracies.
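For readers unfamiliar with the Weisfeiler-Lehman machinery, the sketch below shows the classic 1-dimensional color refinement that underlies WL feature maps; the k-dimensional and stochastic variants in the abstract are more involved, and the function and variable names here are my own illustration, not the authors' code:

```python
from collections import Counter

def wl_refine(adj, rounds=3):
    """1-dimensional Weisfeiler-Lehman color refinement.

    `adj` maps each vertex to its neighbor list. Each round replaces
    a vertex's label with a hash of its own label plus the sorted
    multiset of neighbor labels; the per-round label histograms form
    the feature map used by WL graph kernels.
    """
    labels = {v: 0 for v in adj}            # uniform initial labels
    histograms = []
    for _ in range(rounds):
        new = {}
        for v in adj:
            sig = (labels[v], tuple(sorted(labels[u] for u in adj[v])))
            new[v] = hash(sig)
        # compress raw hashes to small consecutive ids
        ids = {h: i for i, h in enumerate(sorted(set(new.values())))}
        labels = {v: ids[h] for v, h in new.items()}
        histograms.append(Counter(labels.values()))
    return histograms
```

A triangle keeps a single color class (every vertex looks alike), while a 3-vertex path splits into endpoint and middle classes after one round, so the two graphs get different feature maps.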
Visualization of multidimensional, multifactorial big data: big data is not merely large data, it is complex data. We train users to decipher this complexity through data visualization.
Data visualization packages of the R software: lattice and ggplot2.
Graphical Data-Mining Analysis With R Software
PR-272: Accelerating Large-Scale Inference with Anisotropic Vector Quantization (Sunghoon Joo)
PR-272: Accelerating Large-Scale Inference with Anisotropic Vector Quantization
[Guo et al., ICML 2020]
Paper link: https://arxiv.org/abs/1908.10396
Video presentation link: https://youtu.be/cU46yR-A0cs
reviewed by Sunghoon Joo
Weakly supervised semantic segmentation of 3D point cloud (Arithmer Inc.)
Slides for a study session given by Dr. Daisuke Sato at Arithmer Inc.
They summarize methods for semantic segmentation of 3D point clouds using 2D weakly-supervised learning.
Arithmer Inc. is a mathematics company that originated in the Graduate School of Mathematical Sciences at the University of Tokyo. We apply modern mathematics to introduce advanced AI systems into solutions across a variety of fields. Our job is to think about how to use AI well to make work more efficient and to produce results that are useful to people.
Arithmer began at the University of Tokyo Graduate School of Mathematical Sciences. Today, our research in modern mathematics and AI systems provides solutions to tough, complex issues. At Arithmer we believe it is our job to realize the potential of AI by improving work efficiency and producing more useful results for society.
Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Pr... (MLAI2)
Many practical graph problems, such as knowledge graph construction and drug-drug interaction prediction, require handling multi-relational graphs. However, handling real-world multi-relational graphs with Graph Neural Networks (GNNs) is often challenging due to their evolving nature, as new entities (nodes) can emerge over time. Moreover, newly emerged entities often have few links, which makes the learning even more difficult. Motivated by this challenge, we introduce a realistic problem of few-shot out-of-graph link prediction, where we not only predict the links between the seen and unseen nodes, as in a conventional out-of-knowledge link prediction task, but also between the unseen nodes, with only a few edges per node. We tackle this problem with a novel transductive meta-learning framework which we refer to as Graph Extrapolation Networks (GEN). GEN meta-learns both the node embedding network for inductive inference (seen-to-unseen) and the link prediction network for transductive inference (unseen-to-unseen). For transductive link prediction, we further propose a stochastic embedding layer to model uncertainty in the link prediction between unseen entities. We validate our model on multiple benchmark datasets for knowledge graph completion and drug-drug interaction prediction. The results show that our model significantly outperforms relevant baselines for out-of-graph link prediction tasks.
Map-Side Merge Joins for Scalable SPARQL BGP Processing (Alexander Schätzle)
In recent times, it has been widely recognized that, due to their inherent scalability, frameworks based on MapReduce are indispensable for so-called "Big Data" applications. However, for Semantic Web applications using SPARQL, there is still a demand for sophisticated MapReduce join techniques for processing basic graph patterns, which are at the core of SPARQL. Renowned for their stable and efficient performance, sort-merge joins have become widely used in DBMSs. In this paper, we demonstrate the adaptation of merge joins for SPARQL BGP processing with MapReduce. Our technique supports both n-way joins and sequences of join operations by applying merge joins within the map phase of MapReduce while the reduce phase is only used to fulfill the preconditions of a subsequent join iteration.
Our experiments with the LUBM benchmark show an average performance benefit between 15% and 48% compared to other MapReduce based approaches while at the same time scaling linearly with the RDF dataset size.
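The heart of the technique is an ordinary sort-merge join applied in the map phase. As a hedged illustration (plain Python, not the paper's MapReduce code), here is the merge step over two relations sorted on their shared join key, e.g. the bindings of a join variable common to two triple patterns:

```python
def merge_join(left, right):
    """Sort-merge join of two lists of (key, value) pairs, both
    sorted ascending by key. Returns (key, left_value, right_value)
    for every matching pair, including duplicate-key runs.
    """
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # emit the cross product of the runs sharing this key
            j0 = j
            while i < len(left) and left[i][0] == lk:
                j = j0
                while j < len(right) and right[j][0] == lk:
                    out.append((lk, left[i][1], right[j][1]))
                    j += 1
                i += 1
    return out
```

Because both inputs are consumed in a single forward pass, the join fits naturally into a map task once the framework guarantees the sort order, which is exactly the precondition the paper's reduce phase re-establishes between join iterations.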
We present Graph Convolutional Networks which, unlike classic DL models, allow supervised learning that exploits both each node's own features and its relationships with the other nodes in the network.
How can we apply machine learning techniques on graphs to obtain predictions in a variety of domains? Learn more from Sami Abu-El-Haija, an AI scientist with experience in both industry (Google Research) and academia (University of Southern California).
DAOR - Bridging the Gap between Community and Node Representations: Graph Emb... (Artem Lutov)
Slides of the presentation given at BigData'19, special session on Information Granulation in Data Science and Scalable Computing.
We present a fully automatic (i.e., without any manual tuning) graph embedding (i.e., network representation learning, or unsupervised feature extraction) computed in near-linear time. The resulting embeddings are interpretable and preserve both low- and high-order structural proximity of the graph nodes; they are learned orders of magnitude faster than the best manually tuned state-of-the-art embedding techniques and perform competitively with them on diverse graph analysis tasks.
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST... (acijjournal)
Cluster analysis of graph-related problems is an important issue nowadays. Different types of graph clustering techniques have appeared in the field, but most of them are vulnerable in terms of effectiveness and fragmentation of output in real-world applications across diverse systems. In this paper, we provide a comparative behavioural analysis of the RNSC (Restricted Neighbourhood Search Clustering) and MCL (Markov Clustering) algorithms on power-law distribution graphs. RNSC is a graph clustering technique using stochastic local search; it tries to achieve optimal-cost clustering by assigning cost functions to the set of clusterings of a graph. This algorithm was implemented by A. D. King only for undirected and unweighted random graphs. Another popular graph clustering algorithm, MCL, is based on a stochastic flow simulation model for weighted graphs. There are plentiful applications of power-law, or scale-free, graphs in nature and society. Scale-free topology is stochastic, i.e., nodes are connected in a random manner. Complex network topologies such as the World Wide Web, the web of human sexual contacts, or the chemical network of a cell basically follow a power-law distribution to represent different real-life systems. This paper uses real large-scale power-law distribution graphs to conduct a performance analysis of RNSC behaviour compared with the Markov Clustering (MCL) algorithm. Extensive experimental results on several synthetic and real power-law distribution datasets reveal the effectiveness of our approach to comparative performance measurement of these algorithms on the basis of cost of clustering, cluster size, modularity index of clustering results, and normalized mutual information (NMI).
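As background on the second algorithm, MCL's flow simulation can be sketched in a few lines. This is a minimal dense-matrix illustration under my own parameter choices and convergence policy, not the paper's implementation:

```python
import numpy as np

def mcl(adj, expansion=2, inflation=2.0, iters=50):
    """Minimal Markov Clustering (MCL) sketch on a dense adjacency
    matrix. Alternates expansion (matrix power, spreading flow) and
    inflation (elementwise power + column normalisation, which
    strengthens strong flows); after convergence, each column's
    attractor row labels its cluster.
    """
    M = np.asarray(adj, dtype=float) + np.eye(len(adj))  # add self-loops
    M = M / M.sum(axis=0)                                # column-stochastic
    for _ in range(iters):
        M = np.linalg.matrix_power(M, expansion)         # expansion
        M = M ** inflation                               # inflation
        M = M / M.sum(axis=0)                            # renormalise
    return [int(np.argmax(M[:, j])) for j in range(M.shape[1])]
```

On two triangles joined by a single bridge edge, the flow concentrates inside each triangle and the bridge is starved, so the two triangles come out as separate clusters.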
Attentive Relational Networks for Mapping Images to Scene Graphs (Sangmin Woo)
M. Qi, W. Li, Z. Yang, Y. Wang, and J. Luo: Attentive relational networks for mapping images to scene graphs. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
I studied at the Indian Institute of Technology, Kharagpur, India, where I did my B.Tech and M.Tech in the Department of Electronics and Electrical Communication Engineering as a student of the 2018 batch. After that, I joined Schneider Electric Systems India Private Limited as a software design engineer; I am currently a Senior Firmware Engineer at the same company, with over four years of work experience. The uploaded ppt is my MTP thesis, on temperature-aware application mapping onto mesh-based network-on-chip using a Genetic Algorithm.
Introduction to Graph neural networks @ Vienna Deep Learning meetup (Liad Magen)
Graphs are useful data structures that can model many sorts of data: from molecular protein structures to social networks, pandemic spreading models, and visually rich content such as websites and invoices. In recent years, graph neural networks have made a huge leap forward. They are a powerful tool that every data scientist should know. In this talk, we will review their basic structure, show some example usages, and explore the existing (Python) tools.
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo... (MLconf)
Graph Representation Learning with a Deep Embedding Approach:
Graphs are a commonly used data structure for representing real-world relationships, e.g., molecular structures, knowledge graphs, and social and communication networks. The effective encoding of graphical information is essential to the success of such applications. In this talk I'll first describe a general deep learning framework, namely structure2vec, for end-to-end graph feature representation learning. Then I'll present direct applications of this model to graph problems at different scales, including community detection and molecular graph classification/regression. We then extend the embedding idea to a temporally evolving user-product interaction graph for recommendation. Finally I'll present our latest work on leveraging reinforcement learning techniques for graph combinatorial optimization, including the vertex cover problem for social influence maximization and the traveling salesman problem for scheduling management.
Techniques to optimize the PageRank algorithm usually fall into two categories: reducing the work per iteration, and reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices (those with the same in-links) helps reduce duplicate computations and thus could also reduce iteration time. Road networks often have chains which can be short-circuited before PageRank computation to improve performance, since the final ranks of chain nodes can be calculated directly; this could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order, which could reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in the computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
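As an illustration of the first optimisation, skipping already-converged vertices, here is a hedged power-iteration sketch (my own naming and thresholds, not the STICD code; dangling-node mass is handled crudely for brevity):

```python
def pagerank_skip(out_adj, d=0.85, eps=1e-12, max_iter=100):
    """PageRank by power iteration, freezing vertices whose rank
    moved less than `eps` in the previous pass so they are not
    recomputed. `out_adj` maps vertex index -> out-neighbour list.
    Dangling vertices are treated as having out-degree 1 (a crude
    simplification; production code redistributes their mass).
    """
    n = len(out_adj)
    in_adj = [[] for _ in range(n)]
    outdeg = [max(len(nbrs), 1) for nbrs in out_adj]
    for u in range(n):
        for v in out_adj[u]:
            in_adj[v].append(u)
    rank = [1.0 / n] * n
    converged = [False] * n
    for _ in range(max_iter):
        changed = False
        for v in range(n):
            if converged[v]:
                continue                      # the skip optimisation
            new = (1 - d) / n + d * sum(rank[u] / outdeg[u] for u in in_adj[v])
            if abs(new - rank[v]) < eps:
                converged[v] = True
            else:
                changed = True
            rank[v] = new
        if not changed:
            break
    return rank
```

Note the skip is a heuristic: a frozen vertex's in-neighbours may still move slightly, which is exactly the accuracy/time trade-off the paragraph above describes.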
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Opendatabay - Open Data Marketplace.pptx (Opendatabay)
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex: Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools, letting you effortlessly explore, discover, and access the data you need so you can focus on extracting valuable insights. Opendatabay also breaks new ground with dedicated AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits, Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay: the marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
2. | GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Outline
● Graph Convolutional Networks (GCN) for Node Classification
● Motivations of Training GCN in Graph Database
● Demo: Paper Classification in a Citation Graph
Paper Classification

(figure: a neural network maps a paper's abstract to a topic. The example abstract, from C. Liu, PNAS (2019), "We examine the quantum confinement in the photoemission ionization energy in air and optical band gap of carbon nanoparticles (CNPs) ...", is encoded as a bag-of-words vector over terms (air: 0, aromatic: 0, band gap: 1, carbon: 1, nanoparticle: 1, orbital: 0), and the model outputs topic probabilities (Phys: 0.9, Bio: 0.1, CS: 0, Econ: 0), shown alongside the paper's reference list [1], [2], ...)
Pros & Cons

Pros:
● Semi-supervised approach
● High accuracy can be achieved with a low labeling rate

Cons:
● Prediction requires graph traversal
● Sizes of A and X scale with the number of edges and vertices
Traditional Model Training Pipeline

Database (sends training data to the ML platform):
● Paper contents table
● Citation relation table
● Data update
● Preprocess data

Machine learning platform (produces the model):
● Build feature matrix X
● Build adjacency matrix A
● Model training
● Model validation
In-Database Model Training

Database:
● Citation graph
● Data update
● Preprocess data
● Model training

(the machine-learning-platform steps of the traditional pipeline, building adjacency matrix A, building feature matrix X, model training, and model validation, now happen inside the database)

● The adjacency matrix is stored as a graph in the database.
● Prediction and training can be done by running queries.
● Better support for continuous model training over evolving data.
● Support for distributed model training.
Distributed Model Training in Graph Database

(figure: a five-vertex citation graph with vertex features x(1)...x(5), edge weights a(1,2), a(1,3), a(1,4), a(2,5), shared weight matrices W(0) and W(1), and predictions ŷ(1), ŷ(2))

● Each vertex collects the features from its neighbors and combines them with its own feature to form z(1).
● Propagate z(1) through W(0) to the hidden layer.
● Compute the activation on the hidden layer, σ(1) = ReLU(z(1) W(0)), using the ReLU function.
● Repeat the first two steps to compute the output layer.
● Compute the prediction using the softmax function.
● Aggregate the prediction errors δ(i) = ŷ(i) - y(i) and use gradient descent to update the weight matrix: W(0) = W(0) - α (∂J/∂W(0)).
Demo (GCN for Node Classification)

Data set
● Cora citation network (undirected)
● 2,708 nodes, 5,429 edges, 7 classes
● Sparse bag-of-words feature vectors (dim: 1433)

Model
● Y = softmax(A ReLU(A X W(0)) W(1))
● One hidden layer with 16 hidden features
● Training: 140, validation: 500, testing: 1000
● Loss: softmax_cross_entropy_with_logits
● Batch gradient descent
● Dropout: 0.5, L2 regularization (5e-4) for the first layer

Data: Sen et al., AI Magazine (2008)
Model: Thomas N. Kipf and Max Welling, ICLR (2017)
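The demo model, Y = softmax(A ReLU(A X W(0)) W(1)), can be sketched as a NumPy forward pass. This is an illustrative reimplementation of the Kipf & Welling (2017) propagation rule with the usual renormalised adjacency, not the demo's in-database query code, and the weight matrices here are untrained placeholders:

```python
import numpy as np

def gcn_forward(A, X, W0, W1):
    """Two-layer GCN forward pass:
    Y = softmax(A_hat ReLU(A_hat X W0) W1),
    where A_hat = D^{-1/2} (A + I) D^{-1/2} is the renormalised
    adjacency from Kipf & Welling (2017).
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)
    A_hat = D_inv_sqrt @ A_hat @ D_inv_sqrt   # symmetric normalisation
    H = np.maximum(A_hat @ X @ W0, 0.0)       # hidden layer with ReLU
    Z = A_hat @ H @ W1
    Z = Z - Z.max(axis=1, keepdims=True)      # numerically stable softmax
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)
```

Each output row is a per-node probability distribution over the 7 classes; training would fit W0 and W1 with the cross-entropy loss listed above.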