SlideShare a Scribd company logo
DeepWalk: Online Learning of
Social Representations
Bryan Perozzi, Rami Al-Rfou, Steven Skiena
Stony Brook University
ACM SIG-KDD
August 26, 2014
Outline
Bryan Perozzi DeepWalk: Online Learning of Social Representations
● Introduction: Graphs as Features
● Language Modeling
● DeepWalk
● Evaluation: Network Classification
● Conclusions & Future Work
Features From Graphs
● Anomaly Detection
● Attribute Prediction
● Clustering
● Link Prediction
● ...
AdjacencyMatrix
|V|
Bryan Perozzi DeepWalk: Online Learning of Social Representations
A first step in machine learning for graphs is to
extract graph features:
● node: degree
● pairs: # of common neighbors
● groups: cluster assignments
What is a Graph Representation?
● Anomaly Detection
● Attribute Prediction
● Clustering
● Link Prediction
● ...
|V| d << |V|
Latent Dimensions
AdjacencyMatrix
Bryan Perozzi DeepWalk: Online Learning of Social Representations
We can also create features by transforming the
graph into a lower dimensional latent
representation.
DeepWalk
● Anomaly Detection
● Attribute Prediction
● Clustering
● Link Prediction
● ...
|V|
DeepWalk
d << |V|
Latent Dimensions
AdjacencyMatrix
Bryan Perozzi DeepWalk: Online Learning of Social Representations
DeepWalk learns a latent representation of
adjacency matrices using deep learning
techniques developed for language modeling.
Visual Example
Bryan Perozzi DeepWalk: Online Learning of Social Representations
On Zachary’s Karate Graph:
Input Output
Advantages of DeepWalk
● Anomaly Detection
● Attribute Prediction
● Clustering
● Link Prediction
● ...
|V|
DeepWalk
d << |V|
Latent Dimensions
AdjacencyMatrix
Bryan Perozzi DeepWalk: Online Learning of Social Representations
● Scalable - An online algorithm that does not
use entire graph at once
● Walks as sentences metaphor
● Works great!
● Implementation available: bit.ly/deepwalk
Outline
Bryan Perozzi DeepWalk: Online Learning of Social Representations
● Introduction: Graphs as Features
● Language Modeling
● DeepWalk
● Evaluation: Network Classification
● Conclusions & Future Work
Language Modeling
Learning a
representation
means learning a
mapping function
from word co-
occurrence
Bryan Perozzi DeepWalk: Online Learning of Social Representations
[Rumelhart+, 2003]
We hope that the learned
representations capture
inherent structure
[Baroni et al, 2009]
World of Word Embeddings
This is a very active research topic in NLP.
• Importance sampling and hierarchical classification were proposed to speed up training.
[F. Morin and Y.Bengio, AISTATS 2005] [Y. Bengio and J. Sencal, IEEENN 2008] [A. Mnih, G.
Hinton, NIPS 2008]
• NLP applications based on learned representations.
[Colbert et al. NLP (Almost) from Scratch, (JMLR), 2011.]
• Recurrent networks were proposed to learn sequential representations.
[Tomas Mikolov et al. ICASSP 2011]
• Composed representations learned through recursive networks were used for parsing,
paraphrase detection, and sentiment analysis.
[ R. Socher, C. Manning, A. Ng, EMNLP (2011, 2012, 2013) NIPS (2011, 2012) ACL (2012,
2013) ]
• Vector spaces of representations are developed to simplify compositionality.
[ T. Mikolov, G. Corrado, K. Chen and J. Dean, ICLR 2013, NIPS 2013]
Word Frequency in Natural Language
Co-OccurrenceMatrix
■ Words frequency in a natural
language corpus follows a power
law.
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Connection: Power Laws
Vertex frequency in random walks on
scale free graphs also follows a power
law.
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Vertex Frequency in SFG
■ Short truncated random walks
are sentences in an artificial
language!
■ Random walk distance is
known to be good features for
many problems
Scale Free Graph
Bryan Perozzi DeepWalk: Online Learning of Social Representations
The Cool Idea
Short random walks =
sentences
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Outline
Bryan Perozzi DeepWalk: Online Learning of Social Representations
● Introduction: Graphs as Features
● Language Modeling
● DeepWalk
● Evaluation: Network Classification
● Conclusions & Future Work
Deep Learning for Networks
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Input: Graph
Random Walks
Representation Mapping
2
1 3
4 5Hierarchical Softmax Output: Representation
Deep Learning for Networks
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Input: Graph
Random Walks
Representation Mapping
2
1 3
4 5Hierarchical Softmax Output: Representation
Random Walks
■ We generate random walks for each vertex
in the graph.
■ Each short random walk has length .
■ Pick the next step uniformly from the vertex
neighbors.
■ Example:
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Deep Learning for Networks
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Input: Graph
Random Walks
Representation Mapping
2
1 3
4 5Hierarchical Softmax Output: Representation
Representation Mapping
Bryan Perozzi DeepWalk: Online Learning of Social Representations
■ Map the vertex under focus ( ) to
its representation.
■ Define a window of size
■ If = 1 and =
Maximize:
Deep Learning for Networks
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Random Walks
2
1 3
4 5
Input: Graph Representation Mapping
Hierarchical Softmax Output: Representation
Hierarchical Softmax
Calculating involves O(V) operations
for each update! Instead:
Bryan Perozzi DeepWalk: Online Learning of Social Representations
● Consider the graph vertices
as leaves of a balanced
binary tree.
● Maximizing
is equivalent to maximizing
the probability of the path
from the root to the node.
specifically, maximizingC1
C2
C3
Each of {C1,
C2,
C3
} is a
logistic binary classifier.
Learning
■ Learned parameters:
■ Vertex representations
■ Tree binary classifiers weights
■ Randomly initialize the representations.
■ For each {C1,
C2,
C3
} calculate the loss
function.
■ Use Stochastic Gradient Descent to update
both the classifier weights and the vertex
representation simultaneously.
Bryan Perozzi DeepWalk: Online Learning of Social Representations
[Mikolov+, 2013]
Deep Learning for Networks
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Input: Graph
Random Walks
Representation Mapping
2
1 3
4 5Hierarchical Softmax Output: Representation
Outline
Bryan Perozzi DeepWalk: Online Learning of Social Representations
● Introduction: Graphs as Features
● Language Modeling
● DeepWalk
● Evaluation: Network Classification
● Conclusions & Future Work
Attribute Prediction
The Semi-Supervised Network Classification
problem:
Bryan Perozzi DeepWalk: Online Learning of Social Representations
2
1
3
4
6
5
Stony Brook
studentsGooglers
INPUT
A partially labelled graph with node
attributes.
OUTPUT
Attributes for nodes which do not
have them.
Baselines
■ Approximate Inference Techniques:
❑ weighted vote Relational Neighbor (wvRN)
■ Latent Dimensions
❑ Spectral Methods
■ SpectralClustering [Tang+, ‘11]
■ MaxModularity [Tang+, ‘09]
❑ k-means
■ EdgeCluster [Tang+, ‘09]
Bryan Perozzi DeepWalk: Online Learning of Social Representations
[Macskassy+, ‘03]
Results: BlogCatalog
Bryan Perozzi
DeepWalk performs well, especially when labels are
sparse.
Results: Flickr
Bryan Perozzi
Results: YouTube
Bryan Perozzi
Spectral Methods do not scale to large graphs.
Parallelization
Bryan Perozzi DeepWalk: Online Learning of Social Representations
● Parallelization doesn’t affect representation quality.
● The sparser the graph, the easier to achieve linear
scalability. (Feng+, NIPS ‘11)
Outline
Bryan Perozzi DeepWalk: Online Learning of Social Representations
● Introduction: Graphs as Features
● Language Modeling
● DeepWalk
● Evaluation: Network Classification
● Conclusions & Future Work
Variants / Future Work
■ Streaming
❑ No need to ever store entire graph
❑ Can build & update representation as new data
comes in.
Bryan Perozzi DeepWalk: Online Learning of Social Representations
■ “Non-Random” Walks
❑ Many graphs occur through as a by-product of
interactions
❑ One could outside processes (users, etc) to feed
the modeling phase
❑ [This is what language modeling is doing]
Take-away
Language Modeling techniques can be
used for online learning of network
representations.
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Thanks!
Bryan Perozzi DeepWalk: Online Learning of Social Representations
Bryan Perozzi
@phanein
bperozzi@cs.stonybrook.edu
DeepWalk available at:
http://bit.ly/deepwalk

More Related Content

What's hot

Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial Networks
Mustafa Yagmur
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical Imaging
Sanghoon Hong
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural Networks
Usman Qayyum
 
Deepwalk vs Node2vec
Deepwalk vs Node2vecDeepwalk vs Node2vec
Deepwalk vs Node2vec
SiddhantVerma49
 
Graph Kernelpdf
Graph KernelpdfGraph Kernelpdf
Graph Kernelpdf
pratik shukla
 
Deep Learning for Graphs
Deep Learning for GraphsDeep Learning for Graphs
Deep Learning for Graphs
DeepLearningBlr
 
Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)
Appsilon Data Science
 
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersEmerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
Sungchul Kim
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Sujit Pal
 
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic SegmentationSemantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
岳華 杜
 
Word2Vec
Word2VecWord2Vec
Word2Vec
hyunyoung Lee
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance Segmentation
Dat Nguyen
 
INTRODUCTION TO NLP, RNN, LSTM, GRU
INTRODUCTION TO NLP, RNN, LSTM, GRUINTRODUCTION TO NLP, RNN, LSTM, GRU
INTRODUCTION TO NLP, RNN, LSTM, GRU
Sri Geetha
 
Shap
ShapShap
Introduction to Graph neural networks @ Vienna Deep Learning meetup
Introduction to Graph neural networks @  Vienna Deep Learning meetupIntroduction to Graph neural networks @  Vienna Deep Learning meetup
Introduction to Graph neural networks @ Vienna Deep Learning meetup
Liad Magen
 
Lecture_16_Self-supervised_Learning.pptx
Lecture_16_Self-supervised_Learning.pptxLecture_16_Self-supervised_Learning.pptx
Lecture_16_Self-supervised_Learning.pptx
Karimdabbabi
 
Basic Generative Adversarial Networks
Basic Generative Adversarial NetworksBasic Generative Adversarial Networks
Basic Generative Adversarial Networks
Dong Heon Cho
 
Graph-Powered Machine Learning
Graph-Powered Machine LearningGraph-Powered Machine Learning
Graph-Powered Machine Learning
Databricks
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
남주 김
 

What's hot (20)

Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial Networks
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical Imaging
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural Networks
 
Deepwalk vs Node2vec
Deepwalk vs Node2vecDeepwalk vs Node2vec
Deepwalk vs Node2vec
 
Graph Kernelpdf
Graph KernelpdfGraph Kernelpdf
Graph Kernelpdf
 
Deep Learning for Graphs
Deep Learning for GraphsDeep Learning for Graphs
Deep Learning for Graphs
 
Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)
 
Emerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision TransformersEmerging Properties in Self-Supervised Vision Transformers
Emerging Properties in Self-Supervised Vision Transformers
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
 
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic SegmentationSemantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance Segmentation
 
INTRODUCTION TO NLP, RNN, LSTM, GRU
INTRODUCTION TO NLP, RNN, LSTM, GRUINTRODUCTION TO NLP, RNN, LSTM, GRU
INTRODUCTION TO NLP, RNN, LSTM, GRU
 
Shap
ShapShap
Shap
 
Introduction to Graph neural networks @ Vienna Deep Learning meetup
Introduction to Graph neural networks @  Vienna Deep Learning meetupIntroduction to Graph neural networks @  Vienna Deep Learning meetup
Introduction to Graph neural networks @ Vienna Deep Learning meetup
 
Lecture_16_Self-supervised_Learning.pptx
Lecture_16_Self-supervised_Learning.pptxLecture_16_Self-supervised_Learning.pptx
Lecture_16_Self-supervised_Learning.pptx
 
Basic Generative Adversarial Networks
Basic Generative Adversarial NetworksBasic Generative Adversarial Networks
Basic Generative Adversarial Networks
 
Graph-Powered Machine Learning
Graph-Powered Machine LearningGraph-Powered Machine Learning
Graph-Powered Machine Learning
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 

Similar to DeepWalk: Online Learning of Representations

Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Universitat Politècnica de Catalunya
 
JOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured DataJOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured Data
Jordan Open Source Association
 
Graphs for Ai and ML
Graphs for Ai and MLGraphs for Ai and ML
Graphs for Ai and ML
Neo4j
 
Introduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingIntroduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable Rendering
Preferred Networks
 
GraphTour Boston - Graphs for AI and ML
GraphTour Boston - Graphs for AI and MLGraphTour Boston - Graphs for AI and ML
GraphTour Boston - Graphs for AI and ML
Neo4j
 
Visual Network Narrations
Visual Network NarrationsVisual Network Narrations
Visual Network Narrations
Janna Joceli Omena
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learning
NAVER Engineering
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
Benjamin Le
 
Talk from NVidia Developer Connect
Talk from NVidia Developer ConnectTalk from NVidia Developer Connect
Talk from NVidia Developer Connect
Anuj Gupta
 
Graph Based Machine Learning with Applications to Media Analytics
Graph Based Machine Learning with Applications to Media AnalyticsGraph Based Machine Learning with Applications to Media Analytics
Graph Based Machine Learning with Applications to Media Analytics
NYC Predictive Analytics
 
The 2nd graph database in sv meetup
The 2nd graph database in sv meetupThe 2nd graph database in sv meetup
The 2nd graph database in sv meetup
Joshua Bae
 
Building AI Applications using Knowledge Graphs
Building AI Applications using Knowledge GraphsBuilding AI Applications using Knowledge Graphs
Building AI Applications using Knowledge Graphs
Andre Freitas
 
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in RGentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Marco Wirthlin
 
3D Environment HOMENavi
3D Environment HOMENavi3D Environment HOMENavi
3D Environment HOMENavi
RLKorea
 
Machine Learning in the Cloud with GraphLab
Machine Learning in the Cloud with GraphLabMachine Learning in the Cloud with GraphLab
Machine Learning in the Cloud with GraphLab
Danny Bickson
 
Improving Machine Learning using Graph Algorithms
Improving Machine Learning using Graph AlgorithmsImproving Machine Learning using Graph Algorithms
Improving Machine Learning using Graph Algorithms
Neo4j
 
[20240304_LabSeminar_Huy]DeepWalk: Online Learning of Social Representations....
[20240304_LabSeminar_Huy]DeepWalk: Online Learning of Social Representations....[20240304_LabSeminar_Huy]DeepWalk: Online Learning of Social Representations....
[20240304_LabSeminar_Huy]DeepWalk: Online Learning of Social Representations....
thanhdowork
 
Metrics in virtual worlds
Metrics in virtual worldsMetrics in virtual worlds
Metrics in virtual worlds
Michael Vallance
 
Attentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene GraphsAttentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene Graphs
Sangmin Woo
 
OWF14 - Big Data : The State of Machine Learning in 2014
OWF14 - Big Data : The State of Machine  Learning in 2014OWF14 - Big Data : The State of Machine  Learning in 2014
OWF14 - Big Data : The State of Machine Learning in 2014
Paris Open Source Summit
 

Similar to DeepWalk: Online Learning of Representations (20)

Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
JOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured DataJOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured Data
 
Graphs for Ai and ML
Graphs for Ai and MLGraphs for Ai and ML
Graphs for Ai and ML
 
Introduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingIntroduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable Rendering
 
GraphTour Boston - Graphs for AI and ML
GraphTour Boston - Graphs for AI and MLGraphTour Boston - Graphs for AI and ML
GraphTour Boston - Graphs for AI and ML
 
Visual Network Narrations
Visual Network NarrationsVisual Network Narrations
Visual Network Narrations
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learning
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
 
Talk from NVidia Developer Connect
Talk from NVidia Developer ConnectTalk from NVidia Developer Connect
Talk from NVidia Developer Connect
 
Graph Based Machine Learning with Applications to Media Analytics
Graph Based Machine Learning with Applications to Media AnalyticsGraph Based Machine Learning with Applications to Media Analytics
Graph Based Machine Learning with Applications to Media Analytics
 
The 2nd graph database in sv meetup
The 2nd graph database in sv meetupThe 2nd graph database in sv meetup
The 2nd graph database in sv meetup
 
Building AI Applications using Knowledge Graphs
Building AI Applications using Knowledge GraphsBuilding AI Applications using Knowledge Graphs
Building AI Applications using Knowledge Graphs
 
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in RGentle Introduction: Bayesian Modelling and Probabilistic Programming in R
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in R
 
3D Environment HOMENavi
3D Environment HOMENavi3D Environment HOMENavi
3D Environment HOMENavi
 
Machine Learning in the Cloud with GraphLab
Machine Learning in the Cloud with GraphLabMachine Learning in the Cloud with GraphLab
Machine Learning in the Cloud with GraphLab
 
Improving Machine Learning using Graph Algorithms
Improving Machine Learning using Graph AlgorithmsImproving Machine Learning using Graph Algorithms
Improving Machine Learning using Graph Algorithms
 
[20240304_LabSeminar_Huy]DeepWalk: Online Learning of Social Representations....
[20240304_LabSeminar_Huy]DeepWalk: Online Learning of Social Representations....[20240304_LabSeminar_Huy]DeepWalk: Online Learning of Social Representations....
[20240304_LabSeminar_Huy]DeepWalk: Online Learning of Social Representations....
 
Metrics in virtual worlds
Metrics in virtual worldsMetrics in virtual worlds
Metrics in virtual worlds
 
Attentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene GraphsAttentive Relational Networks for Mapping Images to Scene Graphs
Attentive Relational Networks for Mapping Images to Scene Graphs
 
OWF14 - Big Data : The State of Machine Learning in 2014
OWF14 - Big Data : The State of Machine  Learning in 2014OWF14 - Big Data : The State of Machine  Learning in 2014
OWF14 - Big Data : The State of Machine Learning in 2014
 

Recently uploaded

SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
KrushnaDarade1
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
Pests of Storage_Identification_Dr.UPR.pdf
Pests of Storage_Identification_Dr.UPR.pdfPests of Storage_Identification_Dr.UPR.pdf
Pests of Storage_Identification_Dr.UPR.pdf
PirithiRaju
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Texas Alliance of Groundwater Districts
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
Carl Bergstrom
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
LengamoLAppostilic
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
Aditi Bajpai
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
Vandana Devesh Sharma
 
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
hozt8xgk
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
Advanced-Concepts-Team
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
Abdul Wali Khan University Mardan,kP,Pakistan
 
11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf
PirithiRaju
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
İsa Badur
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
MaheshaNanjegowda
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
by6843629
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
University of Hertfordshire
 
Katherine Romanak - Geologic CO2 Storage.pdf
Katherine Romanak - Geologic CO2 Storage.pdfKatherine Romanak - Geologic CO2 Storage.pdf
Katherine Romanak - Geologic CO2 Storage.pdf
Texas Alliance of Groundwater Districts
 

Recently uploaded (20)

SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
Pests of Storage_Identification_Dr.UPR.pdf
Pests of Storage_Identification_Dr.UPR.pdfPests of Storage_Identification_Dr.UPR.pdf
Pests of Storage_Identification_Dr.UPR.pdf
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
 
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
 
11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
 
Katherine Romanak - Geologic CO2 Storage.pdf
Katherine Romanak - Geologic CO2 Storage.pdfKatherine Romanak - Geologic CO2 Storage.pdf
Katherine Romanak - Geologic CO2 Storage.pdf
 

DeepWalk: Online Learning of Representations

  • 1. DeepWalk: Online Learning of Social Representations Bryan Perozzi, Rami Al-Rfou, Steven Skiena Stony Brook University ACM SIG-KDD August 26, 2014
  • 2. Outline Bryan Perozzi DeepWalk: Online Learning of Social Representations ● Introduction: Graphs as Features ● Language Modeling ● DeepWalk ● Evaluation: Network Classification ● Conclusions & Future Work
  • 3. Features From Graphs ● Anomaly Detection ● Attribute Prediction ● Clustering ● Link Prediction ● ... AdjacencyMatrix |V| Bryan Perozzi DeepWalk: Online Learning of Social Representations A first step in machine learning for graphs is to extract graph features: ● node: degree ● pairs: # of common neighbors ● groups: cluster assignments
  • 4. What is a Graph Representation? ● Anomaly Detection ● Attribute Prediction ● Clustering ● Link Prediction ● ... |V| d << |V| Latent Dimensions AdjacencyMatrix Bryan Perozzi DeepWalk: Online Learning of Social Representations We can also create features by transforming the graph into a lower dimensional latent representation.
  • 5. DeepWalk ● Anomaly Detection ● Attribute Prediction ● Clustering ● Link Prediction ● ... |V| DeepWalk d << |V| Latent Dimensions AdjacencyMatrix Bryan Perozzi DeepWalk: Online Learning of Social Representations DeepWalk learns a latent representation of adjacency matrices using deep learning techniques developed for language modeling.
  • 6. Visual Example Bryan Perozzi DeepWalk: Online Learning of Social Representations On Zachary’s Karate Graph: Input Output
  • 7. Advantages of DeepWalk ● Anomaly Detection ● Attribute Prediction ● Clustering ● Link Prediction ● ... |V| DeepWalk d << |V| Latent Dimensions AdjacencyMatrix Bryan Perozzi DeepWalk: Online Learning of Social Representations ● Scalable - An online algorithm that does not use entire graph at once ● Walks as sentences metaphor ● Works great! ● Implementation available: bit.ly/deepwalk
  • 8. Outline Bryan Perozzi DeepWalk: Online Learning of Social Representations ● Introduction: Graphs as Features ● Language Modeling ● DeepWalk ● Evaluation: Network Classification ● Conclusions & Future Work
  • 9. Language Modeling Learning a representation means learning a mapping function from word co- occurrence Bryan Perozzi DeepWalk: Online Learning of Social Representations [Rumelhart+, 2003] We hope that the learned representations capture inherent structure [Baroni et al, 2009]
  • 10. World of Word Embeddings This is a very active research topic in NLP. • Importance sampling and hierarchical classification were proposed to speed up training. [F. Morin and Y.Bengio, AISTATS 2005] [Y. Bengio and J. Sencal, IEEENN 2008] [A. Mnih, G. Hinton, NIPS 2008] • NLP applications based on learned representations. [Colbert et al. NLP (Almost) from Scratch, (JMLR), 2011.] • Recurrent networks were proposed to learn sequential representations. [Tomas Mikolov et al. ICASSP 2011] • Composed representations learned through recursive networks were used for parsing, paraphrase detection, and sentiment analysis. [ R. Socher, C. Manning, A. Ng, EMNLP (2011, 2012, 2013) NIPS (2011, 2012) ACL (2012, 2013) ] • Vector spaces of representations are developed to simplify compositionality. [ T. Mikolov, G. Corrado, K. Chen and J. Dean, ICLR 2013, NIPS 2013]
  • 11. Word Frequency in Natural Language Co-OccurrenceMatrix ■ Words frequency in a natural language corpus follows a power law. Bryan Perozzi DeepWalk: Online Learning of Social Representations
  • 12. Connection: Power Laws Vertex frequency in random walks on scale free graphs also follows a power law. Bryan Perozzi DeepWalk: Online Learning of Social Representations
  • 13. Vertex Frequency in SFG ■ Short truncated random walks are sentences in an artificial language! ■ Random walk distance is known to be good features for many problems Scale Free Graph Bryan Perozzi DeepWalk: Online Learning of Social Representations
  • 14. The Cool Idea Short random walks = sentences Bryan Perozzi DeepWalk: Online Learning of Social Representations
  • 15. Outline Bryan Perozzi DeepWalk: Online Learning of Social Representations ● Introduction: Graphs as Features ● Language Modeling ● DeepWalk ● Evaluation: Network Classification ● Conclusions & Future Work
  • 16. Deep Learning for Networks Bryan Perozzi DeepWalk: Online Learning of Social Representations Input: Graph Random Walks Representation Mapping 2 1 3 4 5Hierarchical Softmax Output: Representation
  • 17. Deep Learning for Networks Bryan Perozzi DeepWalk: Online Learning of Social Representations Input: Graph Random Walks Representation Mapping 2 1 3 4 5Hierarchical Softmax Output: Representation
  • 18. Random Walks ■ We generate random walks for each vertex in the graph. ■ Each short random walk has length . ■ Pick the next step uniformly from the vertex neighbors. ■ Example: Bryan Perozzi DeepWalk: Online Learning of Social Representations
  • 19. Deep Learning for Networks Bryan Perozzi DeepWalk: Online Learning of Social Representations Input: Graph Random Walks Representation Mapping 2 1 3 4 5Hierarchical Softmax Output: Representation
  • 20. Representation Mapping Bryan Perozzi DeepWalk: Online Learning of Social Representations ■ Map the vertex under focus ( ) to its representation. ■ Define a window of size ■ If = 1 and = Maximize:
  • 21. Deep Learning for Networks Bryan Perozzi DeepWalk: Online Learning of Social Representations Random Walks 2 1 3 4 5 Input: Graph Representation Mapping Hierarchical Softmax Output: Representation
  • 22. Hierarchical Softmax Calculating involves O(V) operations for each update! Instead: Bryan Perozzi DeepWalk: Online Learning of Social Representations ● Consider the graph vertices as leaves of a balanced binary tree. ● Maximizing is equivalent to maximizing the probability of the path from the root to the node. specifically, maximizingC1 C2 C3 Each of {C1, C2, C3 } is a logistic binary classifier.
  • 23. Learning ■ Learned parameters: ■ Vertex representations ■ Tree binary classifiers weights ■ Randomly initialize the representations. ■ For each {C1, C2, C3 } calculate the loss function. ■ Use Stochastic Gradient Descent to update both the classifier weights and the vertex representation simultaneously. Bryan Perozzi DeepWalk: Online Learning of Social Representations [Mikolov+, 2013]
  • 24. Deep Learning for Networks Bryan Perozzi DeepWalk: Online Learning of Social Representations Input: Graph Random Walks Representation Mapping 2 1 3 4 5Hierarchical Softmax Output: Representation
  • 25. Outline Bryan Perozzi DeepWalk: Online Learning of Social Representations ● Introduction: Graphs as Features ● Language Modeling ● DeepWalk ● Evaluation: Network Classification ● Conclusions & Future Work
  • 26. Attribute Prediction The Semi-Supervised Network Classification problem: Bryan Perozzi DeepWalk: Online Learning of Social Representations 2 1 3 4 6 5 Stony Brook studentsGooglers INPUT A partially labelled graph with node attributes. OUTPUT Attributes for nodes which do not have them.
  • 27. Baselines ■ Approximate Inference Techniques: ❑ weighted vote Relational Neighbor (wvRN) ■ Latent Dimensions ❑ Spectral Methods ■ SpectralClustering [Tang+, ‘11] ■ MaxModularity [Tang+, ‘09] ❑ k-means ■ EdgeCluster [Tang+, ‘09] Bryan Perozzi DeepWalk: Online Learning of Social Representations [Macskassy+, ‘03]
  • 28. Results: BlogCatalog Bryan Perozzi DeepWalk performs well, especially when labels are sparse.
  • 30. Results: YouTube Bryan Perozzi Spectral Methods do not scale to large graphs.
  • 31. Parallelization Bryan Perozzi DeepWalk: Online Learning of Social Representations ● Parallelization doesn’t affect representation quality. ● The sparser the graph, the easier to achieve linear scalability. (Feng+, NIPS ‘11)
  • 32. Outline Bryan Perozzi DeepWalk: Online Learning of Social Representations ● Introduction: Graphs as Features ● Language Modeling ● DeepWalk ● Evaluation: Network Classification ● Conclusions & Future Work
  • 33. Variants / Future Work ■ Streaming ❑ No need to ever store entire graph ❑ Can build & update representation as new data comes in. Bryan Perozzi DeepWalk: Online Learning of Social Representations ■ “Non-Random” Walks ❑ Many graphs occur through as a by-product of interactions ❑ One could outside processes (users, etc) to feed the modeling phase ❑ [This is what language modeling is doing]
  • 34. Take-away Language Modeling techniques can be used for online learning of network representations. Bryan Perozzi DeepWalk: Online Learning of Social Representations
  • 35. Thanks! Bryan Perozzi DeepWalk: Online Learning of Social Representations Bryan Perozzi @phanein bperozzi@cs.stonybrook.edu DeepWalk available at: http://bit.ly/deepwalk