A GENTLE TUTORIAL ON GRAPH NEURAL NETWORKS AND THEIR APPLICATION TO PROGRAMMING LANGUAGES
Yizhou Sun
Department of Computer Science
University of California, Los Angeles
yzsun@cs.ucla.edu
August 29, 2023
Outline
• Introduction
• Graph Neural Networks
• Downstream Tasks for Graphs
• Applications in PL
• Discussions
Graph Analysis
•Graphs are ubiquitous
• Social networks
• Proteins
• Chemical compounds
• Program dependence graph
• ...
Representing Nodes and Graphs
•Important for many graph-related tasks
•The discrete nature of graphs makes it very challenging
•Naïve solutions (e.g., one-hot or adjacency-based vectors) have limitations:
• Extremely high-dimensional
• No global structure information integrated
• Permutation-variant
Even more challenging for graph representation
•Ex. Graphlet-based feature vector
Source:
https://haotang1995.github.io/projects/robust
_graph_level_representation_learning_using_g
raph_based_structural_attentional_learning
Source: DOI: 10.1093/bioinformatics/btv130
Requires subgraph isomorphism test: NP-hard
Automatic representation Learning
•Map each node/graph into a low-dimensional vector
• 𝜙: 𝑉 → 𝑅^𝑑 or 𝜙: 𝒢 → 𝑅^𝑑
•Earlier methods
• Shallow node embedding
methods inspired by
word2vec
• DeepWalk [Perozzi, KDD’14]
• LINE [Tang, WWW’15]
• Node2Vec [Grover, KDD’16]
Source: DeepWalk
𝝓(𝒗) = 𝑼^𝑻 𝒙𝒗, where U is the embedding matrix and 𝒙𝒗 is the one-hot encoding vector
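As a minimal illustration of this lookup (the matrix sizes and random values below are arbitrary, not from the slides): multiplying the embedding matrix by a one-hot vector simply selects one row of parameters per node.

```python
import numpy as np

# Shallow node embedding: phi(v) = U^T x_v, where U is a |V| x d embedding
# matrix whose entries are the learned parameters and x_v is the one-hot
# encoding vector of node v. Toy sizes (4 nodes, 3 dims) are illustrative.

num_nodes, dim = 4, 3
rng = np.random.default_rng(0)
U = rng.normal(size=(num_nodes, dim))   # one row of parameters per node

def phi(v: int) -> np.ndarray:
    """Embedding of node v: multiplying by a one-hot vector selects row v."""
    x_v = np.zeros(num_nodes)
    x_v[v] = 1.0
    return U.T @ x_v                    # identical to U[v]
```

This makes the "too many parameters" limitation on the next slide concrete: the table U grows linearly with |V|, and a node unseen at training time has no row at all.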
Limitation of shallow embedding techniques
•Too many parameters
• Each node is associated with its own embedding vector, and every entry of these vectors is a learnable parameter
•Not inductive
• Cannot handle new nodes
•Cannot handle node attributes
From shallow embedding to Graph Neural
Networks
•The embedding function (encoder) is more
complicated
• Shallow embedding
• 𝜙(𝑣) = 𝑈^𝑇 𝑥𝑣, where U is the embedding matrix and 𝑥𝑣 is the one-hot encoding vector
• Graph neural networks
• 𝜙 𝑣 is a neural network depending on the graph
structure
Outline
• Introduction
• Graph Neural Networks
• Downstream Tasks for Graphs
• Applications in PL
• Discussions
Notations
•An attributed graph 𝐺 = (𝑉, 𝐸)
• 𝑉: vertex set
• 𝐸: edge set
• 𝐴: adjacency matrix
• 𝑋 ∈ 𝑅^(𝑑0×|𝑉|): feature matrix for all the nodes
• 𝑁(𝑣): neighbors of node 𝑣
• ℎ𝑣^(𝑙): representation vector of node 𝑣 at layer 𝑙
• Note ℎ𝑣^(0) = 𝑥𝑣
• 𝐻^(𝑙) ∈ 𝑅^(𝑑𝑙×|𝑉|): representation matrix at layer 𝑙
The General Architecture of GNNs
•For a node v at layer 𝑙
• A function of representations of neighbors and
itself from previous layers
• Aggregation of neighbors
• Transformation to a different space
• Combination of neighbors and the node itself
(Equation annotations: the representation vector of node v from the previous layer, and the representation vectors of node v's neighbors from the previous layer.)
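The aggregate/transform/combine recipe above can be sketched as follows. This is a generic illustration, not any specific published variant; mean aggregation, the ReLU nonlinearity, and the row-major storage of node vectors are all assumptions made for the sketch.

```python
import numpy as np

# One generic GNN layer: for each node v, (1) aggregate the neighbors'
# previous-layer vectors, (2) transform with shared weight matrices,
# (3) combine with v's own previous-layer vector. Node vectors are rows of H.

def relu(x):
    return np.maximum(x, 0.0)

def gnn_layer(H, adj, W_self, W_neigh):
    """H: |V| x d matrix; adj: dict mapping node -> list of neighbor indices."""
    H_next = np.zeros((H.shape[0], W_self.shape[1]))
    for v, nbrs in adj.items():
        agg = H[nbrs].mean(axis=0) if nbrs else np.zeros(H.shape[1])  # aggregate
        H_next[v] = relu(H[v] @ W_self + agg @ W_neigh)  # transform + combine
    return H_next
```

Specific architectures (GCN, GraphSAGE, GAT) differ mainly in how the aggregation and combination steps are instantiated.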
Compare with CNN
•Recall CNN
• Convolution operates on a regular grid, a special case of a graph
•GNN
• Extends convolution to irregular graph structures
Graph Convolutional Network (GCN)
•Kipf and Welling, ICLR’17
• 𝐻^(𝑘+1) = 𝜎(𝑊^(𝑘) 𝐻^(𝑘) 𝑓(Ã)), where Ã = 𝐴 + 𝐼 (self-loops added)
• f: graph filter, 𝑓(Ã) = D̃^(−1/2) Ã D̃^(−1/2) with D̃ the degree matrix of Ã
•From a node v’s perspective: ℎ𝑣^(𝑘+1) = 𝜎(𝑊^(𝑘) Σ_{𝑢∈𝑁(𝑣)∪{𝑣}} ℎ𝑢^(𝑘) / √(d̃𝑢 d̃𝑣))
• 𝑊^(𝑘): weight matrix at layer 𝑘, shared across different nodes
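A runnable sketch of one GCN layer under these formulas. Node vectors are stored as rows here (the transpose of the slide's column convention), and the ReLU choice and toy graph in the test are illustrative.

```python
import numpy as np

# One GCN layer: H' = ReLU(A_hat @ H @ W), where A_hat is the symmetrically
# normalized adjacency with self-loops: A_hat = D~^{-1/2} (A + I) D~^{-1/2}.
# W is shared across all nodes.

def gcn_layer(A, H, W):
    A_tilde = A + np.eye(A.shape[0])            # add self-loops: A~ = A + I
    d = A_tilde.sum(axis=1)                     # degrees of A~
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt   # the graph filter f(A~)
    return np.maximum(A_hat @ H @ W, 0.0)       # ReLU nonlinearity
```

Stacking k such layers mixes information from each node's k-hop neighborhood, which is what the 2-layer toy example on the next slide visualizes.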
A toy example of 2-layer GCN on a 4-node
graph
•Computation graph
GraphSAGE
• Inductive Representation Learning on Large Graphs
William L. Hamilton*, Rex Ying*, Jure Leskovec,
NeurIPS’17
A more general form
More about AGG
•Mean
•LSTM
• 𝜋(⋅): a random permutation of the neighbor set (the LSTM is order-sensitive)
•Pool
• 𝛾(⋅): element-wise mean/max pooling over the transformed neighbor set
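The pool aggregator, for instance, can be sketched as below. A plain linear map stands in for GraphSAGE's per-neighbor MLP, and the shapes are illustrative.

```python
import numpy as np

# Pool aggregator: transform each neighbor's vector, then apply gamma(.) as
# an element-wise max over the neighbor set. Because max is symmetric, the
# result does not depend on neighbor ordering (unlike the LSTM aggregator).

def agg_pool(H, nbrs, W_pool):
    """Element-wise max over transformed neighbor vectors (rows of H)."""
    return np.maximum.reduce([H[u] @ W_pool for u in nbrs])
```

The same skeleton gives the mean aggregator by swapping `np.maximum.reduce` for a mean.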
Message-Passing Neural Network
• Gilmer et al., 2017. Neural Message Passing for
Quantum Chemistry. ICML.
• A general framework that subsumes most GNNs
• Can also include edge information
• Two steps
• Step 1: get messages from neighbors at step 𝑘 (message function, e.g., a sum or an MLP)
• Step 2: update the node’s latent representation based on the messages (update function, e.g., an LSTM or GRU cell)
A special case: GGNN, Li et al., Gated Graph Sequence Neural Networks, ICLR 2016
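The two message-passing steps might look like this. A linear message function and a tanh update stand in for the MLP and GRU/LSTM choices, and all shapes are illustrative; the inclusion of an edge feature e_uv in the message is what distinguishes this from plain GCN-style aggregation.

```python
import numpy as np

# MPNN sketch. Step 1 (message): each edge (u, v) contributes a message built
# from h_u and the edge feature e_uv, summed at the target node. Step 2
# (update): each node's state is updated from its old state and its message.

def mpnn_step(H, edges, edge_feat, W_msg, W_upd):
    """H: |V| x d; edges: list of directed (u, v); edge_feat: (u, v) -> vector."""
    M = np.zeros((H.shape[0], W_msg.shape[1]))
    for (u, v) in edges:                                    # message phase
        M[v] += np.concatenate([H[u], edge_feat[(u, v)]]) @ W_msg
    return np.tanh(np.concatenate([H, M], axis=1) @ W_upd)  # update phase
```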
Graph Attention Network (GAT)
•How to decide the importance of neighbors?
• GCN: a predefined weight
• Others: no differentiation
•GAT: decide the weights using learnable attention
• Velickovic et al., 2018. Graph Attention
Networks. ICLR.
The attention mechanism
•Potentially many possible designs
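One possible design, following the GAT paper's formulation: e_ij = LeakyReLU(a^T [W h_i ; W h_j]), normalized with a softmax over node i's neighborhood. The LeakyReLU slope of 0.2 follows the paper; all sizes in the test are illustrative.

```python
import numpy as np

# Attention coefficients for one node i over its neighbors, GAT-style, and
# the resulting attention-weighted aggregation of transformed neighbors.

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_attention(H, i, nbrs, W, a):
    z = H @ W.T                                   # transformed features W h
    scores = np.array([leaky_relu(a @ np.concatenate([z[i], z[j]]))
                       for j in nbrs])
    alpha = np.exp(scores - scores.max())         # softmax over the neighborhood
    alpha /= alpha.sum()
    h_new = sum(w * z[j] for w, j in zip(alpha, nbrs))
    return alpha, h_new
```

Other designs (dot-product attention, additive attention without the concat) plug into the same softmax-over-neighbors skeleton.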
Heterogeneous Graph Transformer (HGT)
•How to handle heterogeneous types of
nodes and relations?
• Introduce different weight matrices for different
types of nodes and relations
• Introduce different attention weight matrices for
different types of nodes and relations
(Figure: a target node 𝑡 of type Paper, with source node 𝑠1 of type Author connected by a Write edge and source node 𝑠2 of type Paper connected by a Cite edge.)
Hu et al., Heterogeneous Graph Transformer, WWW’20
Meta-Relation-based Parametrization
•Introduce node- and edge- dependent
parameterization
• Leverage meta relation <source node type, edge
type, target node type> to parameterize attention
and message passing weight.
W<Author, Write, Paper> = WAuthor · WWrite · WPaper
W<Paper, Cite, Paper> = WPaper · WCite · WPaper
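The composition above can be sketched directly. The dimension d and the random factor matrices are illustrative stand-ins for learned parameters.

```python
import numpy as np

# Meta-relation-based parametrization (HGT): instead of one full weight matrix
# per <source type, edge type, target type> triple, compose it from per-node-
# type and per-edge-type factors, e.g. W_<Author,Write,Paper> = W_Author @
# W_Write @ W_Paper.

d = 4
rng = np.random.default_rng(1)
W_node = {t: rng.normal(size=(d, d)) for t in ["Author", "Paper"]}
W_edge = {t: rng.normal(size=(d, d)) for t in ["Write", "Cite"]}

def meta_relation_weight(src_type, edge_type, dst_type):
    """Weight matrix for the meta relation <src_type, edge_type, dst_type>."""
    return W_node[src_type] @ W_edge[edge_type] @ W_node[dst_type]
```

With T node types and R edge types this stores T + R factor matrices rather than one matrix per possible triple, and any triple of seen types gets a weight even if that exact combination never appeared during training.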
Meta-Relation-based Attention
•Attention learning is also parameterized
based on node type and link type
Outline
• Introduction
• Graph Neural Networks
• Downstream Tasks for Graphs
• Applications in PL
• Discussions
Typical Graph Functions
•Node level
• Similarity search
• Link prediction
• Classification
• Community detection
• Ranking
•Graph level
• Similarity search
• Frequent pattern mining
• Graph isomorphism test
• Graph matching
• Classification
• Clustering
• Graph generation
1. Semi-supervised Node Classification
•Decoder using 𝑧𝑣 = ℎ𝑣^(𝐿)
• Feed into another fully connected layer
• ŷ𝑣 = 𝜎(𝜃^𝑇 𝑧𝑣)
•Loss function
• Cross-entropy loss
• In a binary classification case
• 𝑙𝑣 = −[𝑦𝑣 log ŷ𝑣 + (1 − 𝑦𝑣) log(1 − ŷ𝑣)]
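Written out in code, with the conventional leading minus sign that makes the cross-entropy loss non-negative (names and sizes are illustrative):

```python
import numpy as np

# Node-classification decoder and binary cross-entropy loss: z_v is the
# final-layer GNN embedding of node v, theta the decoder parameters, y_v the
# true 0/1 label, and p the predicted probability P(y_v = 1).

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def node_loss(z_v, theta, y_v):
    p = sigmoid(theta @ z_v)
    return -(y_v * np.log(p) + (1 - y_v) * np.log(1 - p))
```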
Applications of Node Classification
•Social network
• An account is bot or not
•Citation network
• A paper’s research field
•A program-derived graph
• The type of a variable
2. Link Prediction
•Decoder using 𝑧𝑣 = ℎ𝑣^(𝐿)
• Given a node pair (𝑢, 𝑣)
• Determine its probability 𝑝𝑢𝑣 = 𝜎(𝑧𝑢^𝑇 𝑅 𝑧𝑣)
• R could be different for different relation types
•Loss function
• Cross-entropy loss
• 𝑙𝑢𝑣 = −[𝑦𝑢𝑣 log 𝑝𝑢𝑣 + (1 − 𝑦𝑢𝑣) log(1 − 𝑝𝑢𝑣)]
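A sketch of this decoder and loss. A sigmoid is applied here so the bilinear score z_u^T R z_v becomes a probability suitable for cross-entropy; that, and the names, are assumptions made for the illustration.

```python
import numpy as np

# Bilinear link-prediction decoder: score a node pair (u, v) with z_u^T R z_v,
# squash to a probability, and train with binary cross-entropy. R can be a
# different learned matrix per relation type.

def link_prob(z_u, z_v, R):
    return 1.0 / (1.0 + np.exp(-(z_u @ R @ z_v)))

def link_loss(z_u, z_v, R, y_uv):
    p = link_prob(z_u, z_v, R)
    return -(y_uv * np.log(p) + (1 - y_uv) * np.log(1 - p))
```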
Link Prediction Applications
•Social network
• Friend recommendation
•Citation network
• Citation recommendation
•Medical network
• Drug and target binding or not
•A program-derived graph
• Code autocomplete
3. Graph Classification
•Decoder using ℎ𝐺 = 𝑔({𝑧𝑣}𝑣∈𝑉)
• 𝑔(⋅): a read-out function, e.g., sum
• Feed ℎ𝐺 into another fully connected layer
• ŷ𝐺 = 𝜎(𝜃^𝑇 ℎ𝐺)
•Loss function
• Cross-entropy loss
• In a binary classification case
• 𝑙𝐺 = −[𝑦𝐺 log ŷ𝐺 + (1 − 𝑦𝐺) log(1 − ŷ𝐺)]
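A sketch with a sum read-out (the choice g = sum follows the slide's example; decoder names are illustrative):

```python
import numpy as np

# Graph-level decoder: pool all node embeddings z_v (rows of Z) into one
# graph vector h_G with a sum read-out, then classify with a linear layer
# plus sigmoid.

def readout_sum(Z):
    return Z.sum(axis=0)            # g({z_v}) with g = sum

def graph_prob(Z, theta):
    h_G = readout_sum(Z)
    return 1.0 / (1.0 + np.exp(-(theta @ h_G)))
```

Note that a sum (or mean/max) read-out is invariant to node ordering, which is exactly the permutation-invariance that the naïve representations at the start of the tutorial lacked.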
Graph Classification Applications
•Chemical compounds
• Toxic or not
•Proteins
• Has certain function or not
•Program-derived graphs
• Contains bugs or not
4. Graph Similarity Computation
•Decoder using ℎ𝐺 = 𝑔({𝑧𝑣}𝑣∈𝑉)
• Given a graph pair (𝐺1, 𝐺2)
• Determine its score 𝑠𝐺1𝐺2 = ℎ𝐺1^𝑇 𝑅 ℎ𝐺2
•Loss function
• E.g., square loss
• 𝑙𝐺1𝐺2 = (𝑦𝐺1𝐺2 − 𝑠𝐺1𝐺2)²
A Concrete solution by SimGNN [Bai et al., AAAI
2019]
• Goal: learn a GNN 𝜙: 𝒢 × 𝒢 → 𝑅+ to approximate the Graph Edit Distance between two graphs
1. Attention-based graph-level embedding
2. Histogram features from pairwise node similarities
Graph Similarity Computation Applications
•Drug database
• Drug similarity search
•Program database
• Code recommendation
• Search for expert (“ninja”) code given novice code
• Search for Java code given COBOL code
Outline
• Introduction
• Graph Neural Networks
• Downstream Tasks for Graphs
• Applications in PL
• Discussions
Deep Learning in PL
•Programs as sequences
• Follow NLP techniques
•Programs as trees
• Ye et al., Context-Aware ParseTrees, 2020
Programs as graphs
• Ben-Nun et al., Neural Code Comprehension: A Learnable
Representation of Code Semantics, NeurIPS 2018
• Then shallow embedding: inst2vec
GNNs in PL
•Allamanis et al., Learning to represent
programs with graphs, ICLR 2018
• Tasks: (1) predict the name of a variable; (2)
predict the right variable for a given location
• Methodology: Gated GNN
GNNs in PL (Cont.)
• Li et al., Graph Matching Networks
for Learning the Similarity of
Graph Structured Objects, ICML
2019
• Tasks: decide whether two
functions are the same or not
• Methodology: extend
message passing across
different graphs
Control flow graph
for functions
Outline
• Introduction
• Graph Neural Networks
• Downstream Tasks for Graphs
• Applications in PL
• Discussions
Open Questions
•How to represent programs as graphs?
• Iyer, Sun, Wang, Gottschlich, Software
Language Comprehension using a Program-
Derived Semantic Graph, arXiv:2004.00768
• Capture program semantics at many levels of
granularity
• A hierarchical graph
Open Questions
•Why do GNNs work?
• Is the nonlinear transformation necessary?
• Chen et al., Are Powerful Graph Neural Nets
Necessary? A Dissection on Graph
Classification, arXiv:1905.04579
• Concatenating feature vectors from graph propagation and feeding them into an MLP works equally well, and is much faster!
Q & A
• Thanks to my collaborators:
• Yunsheng Bai, Wei Wang, Derek Xu, Hao Ding, Ting Chen, Ziniu
Hu, etc…
• Thanks to my funding agencies and industry support:
• NSF, DARPA, PPDAI, Yahoo!, Nvidia, Snapchat, Amazon,
Okawa Foundation
More Related Content

Similar to Sun_MAPL_GNN.pptx

Large graph analysis using g mine system
Large graph analysis using g mine systemLarge graph analysis using g mine system
Large graph analysis using g mine system
saujog
 
240325_JW_labseminar[node2vec: Scalable Feature Learning for Networks].pptx
240325_JW_labseminar[node2vec: Scalable Feature Learning for Networks].pptx240325_JW_labseminar[node2vec: Scalable Feature Learning for Networks].pptx
240325_JW_labseminar[node2vec: Scalable Feature Learning for Networks].pptx
thanhdowork
 
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Preferred Networks
 
"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En..."Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
ssuser2624f71
 
NS-CUK Seminar: H.E.Lee, Review on "Graph Star Net for Generalized Multi-Tas...
NS-CUK Seminar: H.E.Lee,  Review on "Graph Star Net for Generalized Multi-Tas...NS-CUK Seminar: H.E.Lee,  Review on "Graph Star Net for Generalized Multi-Tas...
NS-CUK Seminar: H.E.Lee, Review on "Graph Star Net for Generalized Multi-Tas...
ssuser4b1f48
 
240506_JW_labseminar[Structural Deep Network Embedding].pptx
240506_JW_labseminar[Structural Deep Network Embedding].pptx240506_JW_labseminar[Structural Deep Network Embedding].pptx
240506_JW_labseminar[Structural Deep Network Embedding].pptx
thanhdowork
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks
tm1966
 
node2vec: Scalable Feature Learning for Networks.pptx
node2vec: Scalable Feature Learning for Networks.pptxnode2vec: Scalable Feature Learning for Networks.pptx
node2vec: Scalable Feature Learning for Networks.pptx
ssuser2624f71
 
201907 AutoML and Neural Architecture Search
201907 AutoML and Neural Architecture Search201907 AutoML and Neural Architecture Search
201907 AutoML and Neural Architecture Search
DaeJin Kim
 
How To Find Your Friendly Neighborhood: Graph Attention Design With Self-Supe...
How To Find Your Friendly Neighborhood: Graph Attention Design With Self-Supe...How To Find Your Friendly Neighborhood: Graph Attention Design With Self-Supe...
How To Find Your Friendly Neighborhood: Graph Attention Design With Self-Supe...
ssuser2624f71
 
Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks, arXiv e-...
Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks, arXiv e-...Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks, arXiv e-...
Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks, arXiv e-...
ssuser2624f71
 
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
MLconf
 
NS-CUK Joint Journal Club: V.T.Hoang, Review on "Universal Graph Transformer ...
NS-CUK Joint Journal Club: V.T.Hoang, Review on "Universal Graph Transformer ...NS-CUK Joint Journal Club: V.T.Hoang, Review on "Universal Graph Transformer ...
NS-CUK Joint Journal Club: V.T.Hoang, Review on "Universal Graph Transformer ...
ssuser4b1f48
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learning
sun peiyuan
 
Anguiano varshneya sccur-poster_20141122
Anguiano varshneya sccur-poster_20141122Anguiano varshneya sccur-poster_20141122
Anguiano varshneya sccur-poster_20141122
GRNsight
 
Southwick anguiano lmu-symposium_presentation_20140329
Southwick anguiano lmu-symposium_presentation_20140329Southwick anguiano lmu-symposium_presentation_20140329
Southwick anguiano lmu-symposium_presentation_20140329
GRNsight
 
Ire presentation
Ire presentationIre presentation
Ire presentation
Raj Patel
 
Understanding Large Social Networks | IRE Major Project | Team 57 | LINE
Understanding Large Social Networks | IRE Major Project | Team 57 | LINEUnderstanding Large Social Networks | IRE Major Project | Team 57 | LINE
Understanding Large Social Networks | IRE Major Project | Team 57 | LINE
Raj Patel
 
Presentation
PresentationPresentation
Presentation
Peyman Faizian
 
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph KernelsDDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
ivaderivader
 

Similar to Sun_MAPL_GNN.pptx (20)

Large graph analysis using g mine system
Large graph analysis using g mine systemLarge graph analysis using g mine system
Large graph analysis using g mine system
 
240325_JW_labseminar[node2vec: Scalable Feature Learning for Networks].pptx
240325_JW_labseminar[node2vec: Scalable Feature Learning for Networks].pptx240325_JW_labseminar[node2vec: Scalable Feature Learning for Networks].pptx
240325_JW_labseminar[node2vec: Scalable Feature Learning for Networks].pptx
 
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
 
"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En..."Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
"Sparse Graph Attention Networks", IEEE Transactions on Knowledge and Data En...
 
NS-CUK Seminar: H.E.Lee, Review on "Graph Star Net for Generalized Multi-Tas...
NS-CUK Seminar: H.E.Lee,  Review on "Graph Star Net for Generalized Multi-Tas...NS-CUK Seminar: H.E.Lee,  Review on "Graph Star Net for Generalized Multi-Tas...
NS-CUK Seminar: H.E.Lee, Review on "Graph Star Net for Generalized Multi-Tas...
 
240506_JW_labseminar[Structural Deep Network Embedding].pptx
240506_JW_labseminar[Structural Deep Network Embedding].pptx240506_JW_labseminar[Structural Deep Network Embedding].pptx
240506_JW_labseminar[Structural Deep Network Embedding].pptx
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks
 
node2vec: Scalable Feature Learning for Networks.pptx
node2vec: Scalable Feature Learning for Networks.pptxnode2vec: Scalable Feature Learning for Networks.pptx
node2vec: Scalable Feature Learning for Networks.pptx
 
201907 AutoML and Neural Architecture Search
201907 AutoML and Neural Architecture Search201907 AutoML and Neural Architecture Search
201907 AutoML and Neural Architecture Search
 
How To Find Your Friendly Neighborhood: Graph Attention Design With Self-Supe...
How To Find Your Friendly Neighborhood: Graph Attention Design With Self-Supe...How To Find Your Friendly Neighborhood: Graph Attention Design With Self-Supe...
How To Find Your Friendly Neighborhood: Graph Attention Design With Self-Supe...
 
Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks, arXiv e-...
Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks, arXiv e-...Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks, arXiv e-...
Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks, arXiv e-...
 
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
 
NS-CUK Joint Journal Club: V.T.Hoang, Review on "Universal Graph Transformer ...
NS-CUK Joint Journal Club: V.T.Hoang, Review on "Universal Graph Transformer ...NS-CUK Joint Journal Club: V.T.Hoang, Review on "Universal Graph Transformer ...
NS-CUK Joint Journal Club: V.T.Hoang, Review on "Universal Graph Transformer ...
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learning
 
Anguiano varshneya sccur-poster_20141122
Anguiano varshneya sccur-poster_20141122Anguiano varshneya sccur-poster_20141122
Anguiano varshneya sccur-poster_20141122
 
Southwick anguiano lmu-symposium_presentation_20140329
Southwick anguiano lmu-symposium_presentation_20140329Southwick anguiano lmu-symposium_presentation_20140329
Southwick anguiano lmu-symposium_presentation_20140329
 
Ire presentation
Ire presentationIre presentation
Ire presentation
 
Understanding Large Social Networks | IRE Major Project | Team 57 | LINE
Understanding Large Social Networks | IRE Major Project | Team 57 | LINEUnderstanding Large Social Networks | IRE Major Project | Team 57 | LINE
Understanding Large Social Networks | IRE Major Project | Team 57 | LINE
 
Presentation
PresentationPresentation
Presentation
 
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph KernelsDDGK: Learning Graph Representations for Deep Divergence Graph Kernels
DDGK: Learning Graph Representations for Deep Divergence Graph Kernels
 

Recently uploaded

Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
Hitesh Mohapatra
 
学校原版美国波士顿大学毕业证学历学位证书原版一模一样
学校原版美国波士顿大学毕业证学历学位证书原版一模一样学校原版美国波士顿大学毕业证学历学位证书原版一模一样
学校原版美国波士顿大学毕业证学历学位证书原版一模一样
171ticu
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
Madan Karki
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
IJECEIAES
 
Heat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation pptHeat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation ppt
mamunhossenbd75
 
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
IJECEIAES
 
Understanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine LearningUnderstanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine Learning
SUTEJAS
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
jpsjournal1
 
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball playEric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
enizeyimana36
 
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELDEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
gerogepatton
 
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
University of Maribor
 
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdfIron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
RadiNasr
 
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdfBPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
MIGUELANGEL966976
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Christina Lin
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
MDSABBIROJJAMANPAYEL
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
sachin chaurasia
 
basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
NidhalKahouli2
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
bijceesjournal
 
International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
gerogepatton
 
A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...
nooriasukmaningtyas
 

Recently uploaded (20)

Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
 
学校原版美国波士顿大学毕业证学历学位证书原版一模一样
学校原版美国波士顿大学毕业证学历学位证书原版一模一样学校原版美国波士顿大学毕业证学历学位证书原版一模一样
学校原版美国波士顿大学毕业证学历学位证书原版一模一样
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
 
Heat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation pptHeat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation ppt
 
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
 
Understanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine LearningUnderstanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine Learning
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball playEric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
Eric Nizeyimana's document 2006 from gicumbi to ttc nyamata handball play
 
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELDEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
 
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
 
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdfIron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
 
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdfBPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
 
basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
 
International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
 
A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...
 

Sun_MAPL_GNN.pptx

  • 1. A GENTLE TUTORIAL ON GRAPH NEURAL NETWORKS AND ITS APPLICATION TO PROGRAMMING LANGUAGE Yizhou Sun Department of Computer Science University of California, Los Angeles yzsun@cs.ucla.edu August 29, 2023
  • 2. Outline • Introduction • Graph Neural Networks • Downstream Tasks for Graphs • Applications in PL • Discussions 1
  • 3. Graph Analysis •Graphs are ubiquitous • Social networks • Proteins • Chemical compounds • Program dependence graph • ... 2
  • 4. Representing Nodes and Graphs •Important for many graph related tasks •Discrete nature makes it very challenging •Naïve solutions Limitations: Extremely High-dimensional No global structure information integrated Permutation-variant
  • 5. Even more challenging for graph representation •Ex. Graphlet-based feature vector 4 Source: https://haotang1995.github.io/projects/robust _graph_level_representation_learning_using_g raph_based_structural_attentional_learning Source: DOI: 10.1093/bioinformatics/btv130 Requires subgraph isomorphism test: NP-hard
  • 6. Automatic representation Learning •Map each node/graph into a low dimensional vector • 𝜙: 𝑉 → 𝑅𝑑 or 𝜙: 𝒢 → 𝑅𝑑 •Earlier methods • Shallow node embedding methods inspired by word2vec • DeepWalk [Perozzi, KDD’14] • LINE [Tang, WWW’15] • Node2Vec [Grover, KDD’16] 5 Source: DeepWalk 𝝓 𝒗 = 𝑼𝑻 𝒙𝒗, where U is the embedding matrix and 𝒙𝒗 is the one-hot encoding vector
  • 7. Limitation of shallow embedding techniques •Too many parameters • Each node is associated with an embedding vector, which are parameters •Not inductive • Cannot handle new nodes •Cannot handle node attributes 6
  • 8. From shallow embedding to Graph Neural Networks •The embedding function (encoder) is more complicated • Shallow embedding • 𝜙 𝑣 = 𝑈𝑇𝑥𝑣, where U is the embedding matrix and 𝑥𝑣 is the one-hot encoding vector • Graph neural networks • 𝜙 𝑣 is a neural network depending on the graph structure 7
  • 9. Outline • Introduction • Graph Neural Networks • Downstream Tasks for Graphs • Applications in PL • Discussions 8
  • 10. Notations •An attributed graph 𝐺 = (𝑉, 𝐸) • 𝑉: vertex set • 𝐸: edge set • 𝐴: adjacency matrix • 𝑋 ∈ 𝑅𝑑0×|𝑉| : feature matrix for all the nodes • 𝑁(𝑣): neighbors of node 𝑣 • ℎ𝑣 𝑙 : Representation vector of node 𝑣 at Layer 𝑙 • Note ℎ𝑣 0 = 𝑥𝑣 • 𝐻𝑙 ∈ 𝑅𝑑𝑙×|𝑉| : representation matrix 9
  • 11. The General Architecture of GNNs •For a node v at layer l, its representation is a function of the representations of its neighbors and itself from the previous layer • Aggregation of neighbors • Transformation to a different space • Combination of neighbors and the node itself 10 (inputs: the representation vector from the previous layer for node v, and the representation vectors from the previous layer for node v's neighbors)
  • 12. Compare with CNN •Recall CNN • Regular graph •GNN • Extend to irregular graph structure 11
  • 13. Graph Convolutional Network (GCN) 12 •Kipf and Welling, ICLR'17 • H^(k+1) = σ(f(Ã) H^(k) W_k), where Ã = A + I and f(Ã) = D̃^{-1/2} Ã D̃^{-1/2} is the graph filter (D̃: degree matrix of Ã) •From a node v's perspective: each layer takes a normalized average over v and its neighbors, then applies a linear map and a nonlinearity • W_k: weight matrix at layer k, shared across different nodes
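One GCN layer can be sketched in a few lines of numpy. This follows the Kipf–Welling update H' = ReLU(D̃^{-1/2} (A+I) D̃^{-1/2} H W); the graph and shapes below are illustrative, and nodes are stored as rows (a common convention, transposed relative to the notation slide):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = ReLU( D^{-1/2} (A+I) D^{-1/2} H W )."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops: A~ = A + I
    d = A_hat.sum(axis=1)                     # degrees of A~
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU nonlinearity

# Toy 4-node path graph with 2-d node features and a 2x2 weight matrix
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.eye(4)[:, :2]   # node features, one row per node
W = np.eye(2)
H1 = gcn_layer(A, H, W)
```

Stacking `gcn_layer` twice gives the 2-layer computation graph shown on the next slide: after two layers, each node's vector depends on its 2-hop neighborhood.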
  • 14. A toy example of a 2-layer GCN on a 4-node graph •Computation graph 13
  • 15. GraphSAGE • Inductive Representation Learning on Large Graphs William L. Hamilton*, Rex Ying*, Jure Leskovec, NeurIPS’17 14 A more general form
  • 16. More about AGG •Mean •LSTM • π(⋅): a random permutation •Pool • γ(⋅): element-wise mean/max pooling of the neighbor set 15
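The mean variant of GraphSAGE can be sketched as follows: average the neighbors' vectors, concatenate with the node's own vector, then apply a learned linear map and nonlinearity. All names, shapes, and the toy graph are illustrative:

```python
import numpy as np

def sage_mean_layer(H, neighbors, W):
    """One GraphSAGE layer with mean aggregation (sketch)."""
    out = []
    for v in range(H.shape[0]):
        agg = H[neighbors[v]].mean(axis=0)         # AGG = mean of neighbors
        combined = np.concatenate([H[v], agg])     # CONCAT(h_v, agg)
        out.append(np.maximum(W @ combined, 0.0))  # sigma(W . concat)
    return np.stack(out)

H = np.array([[1., 0.], [0., 1.], [1., 1.]])  # 3 nodes, 2-d features
neighbors = {0: [1, 2], 1: [0], 2: [0, 1]}    # adjacency lists
W = np.ones((2, 4))                           # maps concat(2+2) -> 2
Z = sage_mean_layer(H, neighbors, W)
```

Replacing the `mean` line with an LSTM over a random permutation of the neighbors, or with element-wise max pooling after a per-neighbor MLP, gives the other two aggregators on this slide.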
  • 17. Message-Passing Neural Network • Gilmer et al., 2017. Neural Message Passing for Quantum Chemistry. ICML. • A general framework that subsumes most GNNs • Can also include edge information • Two steps • Get messages from neighbors at step k (e.g., sum or MLP) • Update the node's latent representation based on the messages (e.g., LSTM, GRU) 16 A special case: GGNN, Li et al., Gated Graph Sequence Neural Networks, ICLR 2016
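The two MPNN steps can be sketched concretely. Here messages are summed over incoming edges, and a plain linear map plus ReLU stands in for the learned GRU/LSTM update; everything is an illustrative toy, not Gilmer et al.'s exact parameterization:

```python
import numpy as np

def mpnn_step(H, edges, W_msg, W_upd):
    """One message-passing step: aggregate messages, then update (sketch)."""
    M = np.zeros_like(H)
    for (u, v) in edges:                       # directed edges u -> v
        M[v] += H[u] @ W_msg                   # step 1: message from u to v, summed
    return np.maximum(H @ W_upd + M, 0.0)      # step 2: update with aggregated msg

H = np.eye(3)                  # 3 nodes with one-hot initial states
edges = [(0, 1), (1, 2), (2, 0)]  # a directed 3-cycle
H1 = mpnn_step(H, edges, np.eye(3), np.eye(3))
```

With identity weights, each node's new state is simply its old state plus its in-neighbor's, which makes the information flow easy to check by hand.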
  • 18. Graph Attention Network (GAT) •How to decide the importance of neighbors? • GCN: a predefined weight • Others: no differentiation •GAT: decide the weights using learnable attention • Velickovic et al., 2018. Graph Attention Networks. ICLR. 17
  • 19. The attention mechanism •Potentially many possible designs 18
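One possible design, in the spirit of GAT: score each neighbor with a function of the two transformed vectors, then softmax over the neighborhood. The scoring function here is a simple dot product for illustration; GAT itself uses a LeakyReLU over a learned vector applied to the concatenation:

```python
import numpy as np

def attention_weights(h_v, h_neighbors, W):
    """Attention coefficients of node v over its neighbors (sketch)."""
    q = W @ h_v
    scores = np.array([q @ (W @ h_u) for h_u in h_neighbors])  # e_{vu}
    e = np.exp(scores - scores.max())   # numerically stable softmax
    return e / e.sum()                  # alpha_{vu}, sums to 1

alpha = attention_weights(np.array([1.0, 0.0]),
                          [np.array([1.0, 0.0]), np.array([0.0, 1.0])],
                          np.eye(2))
```

The node's next representation is then the alpha-weighted sum of the transformed neighbor vectors; swapping in a different scoring function changes the design without changing this overall shape.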
  • 20. Heterogeneous Graph Transformer (HGT) •How to handle heterogeneous types of nodes and relations? • Introduce different weight matrices for different types of nodes and relations • Introduce different attention weight matrices for different types of nodes and relations 19 Hu et al., Heterogeneous Graph Transformer, WWW'20
  • 21. Meta-Relation-based Parametrization •Introduce node- and edge-dependent parameterization • Leverage the meta relation <source node type, edge type, target node type> to parameterize the attention and message-passing weights • W_<Author, Write, Paper> = W_Author W_Write W_Paper • W_<Paper, Cite, Paper> = W_Paper W_Cite W_Paper 20
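The composition above can be sketched directly: the weight for a meta relation is a product of per-type matrices, so relations sharing a node or edge type share parameters. The matrices below are random placeholders, not trained weights:

```python
import numpy as np

# Per-type factor matrices (illustrative 2x2 placeholders)
rng = np.random.default_rng(1)
W_author = rng.normal(size=(2, 2))   # source node type: Author
W_paper = rng.normal(size=(2, 2))    # node type: Paper
W_write = rng.normal(size=(2, 2))    # edge type: Write
W_cite = rng.normal(size=(2, 2))     # edge type: Cite

# Meta-relation weights composed from the shared per-type factors
W_author_write_paper = W_author @ W_write @ W_paper
W_paper_cite_paper = W_paper @ W_cite @ W_paper
```

Note that `W_paper` appears in both compositions: adding a new relation over existing types costs only one new edge-type matrix, which is the point of the parameterization.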
  • 22. Meta-Relation-based Attention •Attention learning is also parameterized based on node type and link type 21
  • 23. Outline • Introduction • Graph Neural Networks • Downstream Tasks for Graphs • Applications in PL • Discussions 22
  • 24. Typical Graph Functions •Node level • Similarity search • Link prediction • Classification • Community detection • Ranking 23 •Graph level • Similarity search • Frequent pattern mining • Graph isomorphism test • Graph matching • Classification • Clustering • Graph generation
  • 25. 1. Semi-supervised Node Classification •Decoder using z_v = h_v^(L) • Feed into another fully connected layer • ŷ_v = σ(θ^T z_v) •Loss function • Cross-entropy loss • In a binary classification case • l_v = −[y_v log ŷ_v + (1 − y_v) log(1 − ŷ_v)] 24
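The decoder and its loss fit in a few lines; the vectors below are made-up stand-ins for a GNN's final-layer output z_v:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def node_loss(theta, z_v, y_v):
    """Binary node classification: y_hat = sigmoid(theta^T z_v),
    l_v = -[y log y_hat + (1-y) log(1-y_hat)]."""
    y_hat = sigmoid(theta @ z_v)  # predicted probability of the positive class
    return -(y_v * np.log(y_hat) + (1 - y_v) * np.log(1 - y_hat))

theta = np.array([1.0, -1.0])     # classifier weights (illustrative)
z_v = np.array([2.0, 0.5])        # final-layer embedding of node v
loss = node_loss(theta, z_v, 1)   # true label y_v = 1
```

In the semi-supervised setting, this loss is summed only over the labeled nodes, while the GNN encoder still propagates over the whole graph.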
  • 26. Applications of Node Classification •Social network • An account is bot or not •Citation network • A paper’s research field •A program-derived graph • The type of a variable 25
  • 27. 2. Link Prediction •Decoder using z_v = h_v^(L) • Given a node pair (u, v) • Determine its probability p_uv = σ(z_u^T R z_v) • R could be different for different relation types •Loss function • Cross-entropy loss • l_uv = −[y_uv log p_uv + (1 − y_uv) log(1 − p_uv)] 26
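A sketch of the bilinear link decoder: score a candidate edge (u, v) as z_u^T R z_v with a relation-specific matrix R, and squash the score into a probability for the cross-entropy loss. The embeddings below are illustrative:

```python
import numpy as np

def link_prob(z_u, z_v, R):
    """Bilinear decoder: p_uv = sigmoid(z_u^T R z_v)."""
    score = z_u @ R @ z_v
    return 1.0 / (1.0 + np.exp(-score))

z_u = np.array([1.0, 0.0])   # embedding of node u (made up)
z_v = np.array([1.0, 1.0])   # embedding of node v (made up)
p = link_prob(z_u, z_v, np.eye(2))  # R = I reduces to a plain dot product
```

Using a different R per relation type lets the same node embeddings support multiple edge semantics (e.g., "cites" vs. "extends"), at the cost of one d×d matrix per relation.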
  • 28. Link Prediction Applications •Social network • Friend recommendation •Citation network • Citation recommendation •Medical network • Drug and target binding or not •A program-derived graph • Code autocomplete 27
  • 29. 3. Graph Classification •Decoder using h_G = g({z_v}_{v∈V}) • g(⋅): a read-out function, e.g., sum • Feed h_G into another fully connected layer • ŷ_G = σ(θ^T h_G) •Loss function • Cross-entropy loss • In a binary classification case • l_G = −[y_G log ŷ_G + (1 − y_G) log(1 − ŷ_G)] 28
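A sketch of the sum read-out followed by a linear classifier; the node embeddings are made-up stand-ins for the GNN's output:

```python
import numpy as np

def graph_predict(Z, theta):
    """Graph classification: h_G = sum_v z_v, y_hat = sigmoid(theta^T h_G)."""
    h_G = Z.sum(axis=0)                          # read-out: sum over all nodes
    return 1.0 / (1.0 + np.exp(-(theta @ h_G)))  # probability of positive class

Z = np.array([[1.0, 0.0],   # node embeddings z_v, one row per node
              [0.0, 1.0],
              [1.0, 1.0]])
y_hat = graph_predict(Z, np.array([0.5, -0.5]))
```

Because the sum read-out is invariant to the order of the rows of `Z`, the prediction does not depend on how the graph's nodes happen to be numbered.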
  • 30. Graph Classification Applications •Chemical compounds • Toxic or not •Proteins • Has certain function or not •Program-derived graphs • Contains bugs or not 29
  • 31. 4. Graph Similarity Computation •Decoder using h_G = g({z_v}_{v∈V}) • Given a graph pair (G_1, G_2) • Determine its score s_{G1,G2} = h_{G1}^T R h_{G2} •Loss function • E.g., square loss • l_{G1,G2} = (y_{G1,G2} − s_{G1,G2})² 30
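A sketch of the similarity decoder and its squared loss; the graph-level embeddings and ground-truth similarity are illustrative values:

```python
import numpy as np

def sim_and_loss(h_g1, h_g2, R, y):
    """Graph similarity s = h_G1^T R h_G2 with square loss (y - s)^2."""
    s = h_g1 @ R @ h_g2
    return s, (y - s) ** 2

h_g1 = np.array([1.0, 2.0])   # read-out embedding of G1 (made up)
h_g2 = np.array([2.0, 1.0])   # read-out embedding of G2 (made up)
s, loss = sim_and_loss(h_g1, h_g2, np.eye(2), y=4.0)
```

SimGNN (next slide) follows this template but replaces the plain sum read-out with attention over nodes and adds histogram features from pairwise node similarities before the final score.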
  • 32. A Concrete solution by SimGNN [Bai et al., AAAI 2019] • Goal: learn a GNN 𝜙: 𝒢 × 𝒢 → 𝑅+ to approximate Graph Edit Distance between two graphs 31 1. Attention-based graph-level embedding 2. Histogram features from pairwise node similarities
  • 33. Graph Similarity Computation Applications •Drug database • Drug similarity search •Program database • Code recommendation • Search expert ("ninja") code given novice code • Search Java code given COBOL code 32
  • 34. Outline • Introduction • Graph Neural Networks • Downstream Tasks for Graphs • Applications in PL • Discussions 33
  • 35. Deep Learning in PL •Programs as sequences • Follow NLP techniques •Programs as trees • Ye et al., Context-Aware ParseTrees, 2020 34
  • 36. Programs as graphs • Ben-Nun et al., Neural Code Comprehension: A Learnable Representation of Code Semantics, NeurIPS 2018 • Then shallow embedding: inst2vec 35
  • 37. GNNs in PL •Allamanis et al., Learning to represent programs with graphs, ICLR 2018 • Tasks: (1) predict the name of a variable; (2) predict the right variable for a given location • Methodology: Gated GNN 36
  • 38. GNNs in PL (Cont.) • Li et al., Graph Matching Networks for Learning the Similarity of Graph Structured Objects, ICML 2019 • Tasks: decide whether two functions are the same or not • Methodology: extend message passing across different graphs 37 Control flow graph for functions
  • 39. Outline • Introduction • Graph Neural Networks • Downstream Tasks for Graphs • Applications in PL • Discussions 38
  • 40. Open Questions •How to represent programs as graphs? • Iyer, Sun, Wang, Gottschlich, Software Language Comprehension using a Program-Derived Semantic Graph, arXiv:2004.00768 • Capture program semantics at many levels of granularity • A hierarchical graph 39
  • 41. Open Questions •Why do GNNs work? • Is the nonlinear transformation necessary? • Chen et al., Are Powerful Graph Neural Nets Necessary? A Dissection on Graph Classification, arXiv:1905.04579 • A concatenated feature vector from graph propagation, followed by an MLP, works equally well, and is much faster! 40
  • 42. Q & A • Thanks to my collaborators: • Yunsheng Bai, Wei Wang, Derek Xu, Hao Ding, Ting Chen, Ziniu Hu, etc… • Thanks to my funding agencies and industry support: • NSF, DARPA, PPDAI, Yahoo!, Nvidia, Snapchat, Amazon, Okawa Foundation 41

Editor's Notes

  1. Protein graph source: http://vishgraph.mbu.iisc.ernet.in/GraProStr/ Control dependence and data dependence; https://www.pcchem.net/chemical-compounds.html
  2. Siamese structure
  3. Aroma system and its simplified parse tree (SPT)