A Geometric Distance Oracle for Large Real-World
Graphs
Hong, Ong Xuan
Data Science School
November 16, 2017
Hong, Ong Xuan (Data Science School), A Geometric Distance Oracle for Large Real-World Graphs, November 16, 2017
Contents
1 Introduction
2 Background
3 Related works
4 Proposed method
5 Evaluation
6 Results
7 Discussion
Introduction
Explosion of available information → mining information about
interactions between subscribers, groups, people, objects, etc.
A fundamental graph computation is finding the shortest-path
distance between arbitrary nodes, but:
Computing and querying distances is slow.
Memory for storing the graph is limited.
How can this analysis be done efficiently?
Background
Graph theory.
Distance oracle.
Approximate distance.
Metric space: Euclidean, Hyperbolic.
δ - hyperbolic metric space.
Graph theory
Let G(V, E) be an undirected, weighted graph with n = |V| nodes and
m = |E| edges. What is the distance between the nodes s and t?
Dijkstra's algorithm: O(m + n log n) per query with a Fibonacci heap,
requires no precomputed storage.
All-pairs distance matrix: query time O(1), requires O(n^2) extra space.
Floyd-Warshall algorithm: returns all-pairs shortest paths, initialized
in time O(n^3).
How to use less than O(n^2) space and answer queries in less than
O(m + n log n) time?
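As a point of reference for the query cost above, here is a minimal sketch of a single-pair Dijkstra query. It uses Python's binary heap (`heapq`), so it runs in O((m + n) log n) rather than the Fibonacci-heap bound quoted on the slide. The adjacency-list layout and the name `dijkstra_distance` are assumptions made for illustration.

```python
import heapq

def dijkstra_distance(adj, s, t):
    """Shortest-path distance from s to t.

    adj maps each node to a list of (neighbor, weight) pairs with
    non-negative weights. Returns float('inf') if t is unreachable.
    """
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d  # first time t is popped, its distance is final
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")
```

Every query re-explores the graph from scratch, which is the cost a distance oracle tries to avoid by precomputing a compact index.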
Distance oracle
A distance oracle is a data structure that is cheap to precompute and
fast to query (ideally in constant time). It should satisfy 4 properties:
Preprocessing time: O(n) or O(n log n).
Storage: less than O(n^2).
Query time: less than O(m + n log n).
Fidelity: approximated distances as close as possible to the actual
distances.
Approximate distance oracles
Approximating distances with spanning trees and distance labeling
(Thorup and Zwick):
Preprocessing time: O(k m n^(1/k)).
Storage: O(k n^(1+1/k)).
Query time: O(k).
Fidelity: the ratio of estimated to actual distance lies in [1, 2k − 1].
Note: k = 1, 2, ..., log n; higher values of k do not improve the space or
preprocessing time.
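To make the trade-off concrete, the bounds above can be evaluated for a few values of k. This is only arithmetic on the asymptotic expressions with constants dropped, not an implementation of the oracle; the function name `tz_bounds` is hypothetical.

```python
def tz_bounds(n, m, k):
    """Evaluate the Thorup-Zwick bounds (constant factors ignored)."""
    return {
        "stretch": 2 * k - 1,                    # worst-case ratio estimate / actual
        "storage": k * n ** (1 + 1 / k),         # O(k n^(1+1/k))
        "preprocessing": k * m * n ** (1 / k),   # O(k m n^(1/k))
    }

if __name__ == "__main__":
    # Illustrative sizes only, not taken from the slides.
    for k in (1, 2, 3):
        print(k, tz_bounds(n=1_000_000, m=10_000_000, k=k))
```

With k = 1 the storage term is n^2 (an exact all-pairs table); increasing k shrinks storage at the price of a larger stretch.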
Metric space
Ordered pair (M, d) where M is a set and d is a metric
d : M × M → R
∀x, y, z ∈ M, the following holds:
d(x, y) ≥ 0
d(x, y) = 0 ⇐⇒ x = y
d(x, y) = d(y, x)
d(x, z) ≤ d(x, y) + d(y, z)
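The four axioms can be checked by brute force on a small finite point set. The sketch below assumes the distance is given as a Python function `d(x, y)`; the name `is_metric` and the tolerance parameter are illustrative choices.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Check the four metric axioms for d on a finite set of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol:                       # non-negativity
            return False
        if (abs(d(x, y)) <= tol) != (x == y):    # d(x, y) = 0 iff x = y
            return False
        if abs(d(x, y) - d(y, x)) > tol:         # symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:    # triangle inequality
            return False
    return True
```

For example, the absolute difference on numbers passes, while the squared difference fails the triangle inequality.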
Euclidean distance
$$d(p, q) = d(q, p) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \dots + (q_n - p_n)^2} = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$$
Hyperbolic distance
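$$d(\langle x_1, y_1 \rangle, \langle x_2, y_2 \rangle) = \operatorname{arcosh}\big(\cosh y_1 \cosh(x_2 - x_1) \cosh y_2 - \sinh y_1 \sinh y_2\big)$$
Where:
$\sinh x = \frac{e^{x} - e^{-x}}{2}$ (hyperbolic sine).
$\cosh x = \frac{e^{x} + e^{-x}}{2}$ (hyperbolic cosine).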
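A direct transcription of this formula into Python might look as follows. The function name is hypothetical, and the clamp to 1.0 is an added safeguard against floating-point rounding pushing the argument of arcosh slightly below its domain.

```python
import math

def hyperbolic_distance(p1, p2):
    """Hyperbolic distance between points p1 = (x1, y1) and p2 = (x2, y2)."""
    x1, y1 = p1
    x2, y2 = p2
    arg = (math.cosh(y1) * math.cosh(x2 - x1) * math.cosh(y2)
           - math.sinh(y1) * math.sinh(y2))
    # arcosh is only defined for arguments >= 1; clamp rounding noise.
    return math.acosh(max(1.0, arg))
```

When both points share the same x coordinate, the expression reduces to cosh(y1 − y2), so the distance is simply |y1 − y2|.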
δ-hyperbolic metric space
A metric space (V, d) embeds into a tree metric iff the 4-point condition
holds. For any w, x, y, z ∈ V, take the three pairwise sums
d(w, x) + d(y, z)
d(x, y) + d(w, z)
d(x, z) + d(w, y)
and name them S, M, L so that S ≤ M ≤ L.
4-point condition: L = M for every such quadruple.
(V, d) is δ-hyperbolic for some δ ≥ 0 if (L − M)/2 ≤ δ for every quadruple.
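The smallest δ satisfying this definition can be computed by brute force over all quadruples. The sketch below takes the distance as a function `d(u, v)` and runs in O(n^4) time, so it is only practical for tiny inputs; the name `four_point_delta` is an assumption.

```python
from itertools import combinations

def four_point_delta(nodes, d):
    """Smallest delta such that (L - M) / 2 <= delta for every quadruple."""
    delta = 0.0
    for w, x, y, z in combinations(nodes, 4):
        # The three ways to pair up four points, sorted so S <= M <= L.
        S, M, L = sorted([
            d(w, x) + d(y, z),
            d(x, y) + d(w, z),
            d(x, z) + d(w, y),
        ])
        delta = max(delta, (L - M) / 2)
    return delta
```

A path (which is a tree) gives δ = 0, while the 4-cycle gives δ = 1.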
Related works
Theoretical results provide guaranteed approximation bounds for
specific graph classes:
Distance labeling in hyperbolic graphs
A Note on Distance Approximating Trees in Graphs
Additive spanners and distance and routing labeling schemes for
hyperbolic graphs
A compact routing scheme and approximate distance oracle for
power-law graphs
Reconstructing approximate tree metrics
Essays in Group Theory
Diameters, centers, and approximating trees of δ-hyperbolic geodesic
spaces and graphs
But these results have not been empirically evaluated on real-world graphs.
Related works
Spanning trees
Quick queries: O(n log n).
Reduced storage space.
Related works
Developing approximate distance oracles on empirical graphs: small-world
graphs, hypergrid graphs, Facebook, telecom, Google news graph, web
graph, etc.
Efficient Shortest Paths on Massive Social Graphs
Fast fully dynamic landmark-based estimation of shortest path
distances in very large graphs
Querying Shortest Path Distance with Bounded Errors in Large
Graphs
Orion: shortest path estimation for large social graphs
Approximating Shortest Paths in Social Graphs
Fast exact shortest-path distance queries on large networks by pruned
landmark labeling
Toward a distance oracle for billion-node graphs
But these heuristics lack a theoretical foundation.
Proposed method
Hyperbolicity-based Breadth-First Search (HyperBFS): uses the notion of
graph hyperbolicity in real-world networks to build spanning trees with:
Height: O(log n).
Distance queries: O(log n).
Storage: O(n) words of space for an n-node graph.
Algorithm
Hyperbolicity-based Tree Oracle: constructing the geometric oracle
Choose a highly central vertex (centrality measured via shortest
paths) as the root. In practice the out-degree is used instead, because
the two are correlated in power-law networks.
Build 1-10 BFS trees with distinct roots, taken in decreasing order of
degree; the trees can be built and distance-labeled in parallel.
The distance between x and y is estimated as the minimum of their
distances over the constructed trees.
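The construction above can be sketched as follows for a connected, unweighted, undirected graph. This is an illustration of the idea, not the authors' implementation: all function names are hypothetical, trees are built sequentially rather than in parallel, and tree distances are found by walking parent pointers, which costs time proportional to the tree height.

```python
from collections import deque

def bfs_tree(adj, root):
    """BFS spanning tree as (parent, depth) dictionaries."""
    parent, depth = {root: None}, {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in depth:
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return parent, depth

def tree_distance(tree, x, y):
    """Number of tree edges between x and y, via their lowest common ancestor."""
    parent, depth = tree
    dist = 0
    while x != y:
        if depth[x] < depth[y]:
            x, y = y, x          # always step up from the deeper node
        x = parent[x]
        dist += 1
    return dist

def build_oracle(adj, num_trees=3):
    """BFS trees rooted at the num_trees highest-degree vertices."""
    roots = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:num_trees]
    return [bfs_tree(adj, r) for r in roots]

def estimate_distance(oracle, x, y):
    """Minimum tree distance over all trees (never below the true distance)."""
    return min(tree_distance(t, x, y) for t in oracle)
```

Because every tree is a subgraph of the original graph, each tree distance is an upper bound on the true distance, so taking the minimum over more trees can only tighten the estimate.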
Algorithm
Set 1: Embedding graph into multi-dimensional geometric space
Mapping the nodes of the graph into points in the hyperbolic space.
The distance between two d-dimensional points x = (x_1, x_2, ..., x_d) and
y = (y_1, y_2, ..., y_d) is defined as follows:
$$\operatorname{arcosh}\left(\sqrt{\Big(1 + \sum_{i=1}^{d} x_i^2\Big)\Big(1 + \sum_{i=1}^{d} y_i^2\Big)} - \sum_{i=1}^{d} x_i y_i\right) \cdot |c|$$
Note: no guarantees on the distance estimation error
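Once coordinates are available, a query is a single formula evaluation. The sketch below assumes the hyperboloid-model form of the distance, with a square root over the product of the two norm terms, and treats c as a scaling constant of the space; the slide does not define c, so that reading is an assumption, as is the function name.

```python
import math

def embedding_distance(x, y, c=1.0):
    """Hyperbolic distance between embedded points x and y (coordinate lists)."""
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    dot = sum(a * b for a, b in zip(x, y))
    arg = math.sqrt((1.0 + sum_x2) * (1.0 + sum_y2)) - dot
    # Clamp rounding noise so the argument stays in the domain of arcosh.
    return math.acosh(max(1.0, arg)) * abs(c)
```

In one dimension, points sinh(a) and sinh(b) are at distance |a − b| · |c|, which gives a quick sanity check. Computing the coordinates themselves (the embedding step) is not shown here.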
Algorithm
Set 2: Gromov-type tree contraction: improves the accuracy of distance
estimates.
Partition the tree into i-level connected components (coalescing multiple
edges into a single edge).
The additive error is guaranteed not to exceed 2δ log n, where δ is the
hyperbolicity constant of the graph.
Evaluation
Four methods benchmarked:
Gromov-type contraction-based tree.
Steiner trees with proven multiplicative bound.
Rigel: landmark-based approach.
HyperBFS: centrality-based spanning tree oracle.
Setup
2.4 GHz Intel(R) Xeon(R) processor with 190GB of RAM.
Calculating distortion: let x, y be vertices of a graph G, let d_G be their
actual distance in G, and let d_A be the distance approximated by a
distance oracle:
Additive distortion: d_G − d_A.
Absolute distortion: |d_G − d_A|.
Multiplicative distortion: |d_G − d_A| / d_G.
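The three distortion measures are straightforward to compute for one vertex pair. In this sketch `d_true` stands for the actual graph distance and `d_est` for the oracle's estimate; the function name and the dictionary layout are illustrative.

```python
def distortions(d_true, d_est):
    """Additive, absolute and multiplicative distortion for one vertex pair.

    d_true must be positive, i.e. the two vertices are distinct and connected.
    """
    additive = d_true - d_est
    absolute = abs(additive)
    multiplicative = absolute / d_true
    return {
        "additive": additive,
        "absolute": absolute,
        "multiplicative": multiplicative,
    }
```

Averaging these values over a sample of vertex pairs yields the kind of average error figures reported for each oracle.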
Figure: Computational Time of Hyper BFS on Call Graph II.
Average absolute error
Figure: Average absolute error on various real-world graphs.
Average additive and multiplicative error
Figure: Average additive and multiplicative error on SantaBarbara Facebook
graph.
Discussion
Exact and approximate algorithms for computing the hyperbolicity of
large-scale graphs (N. Cohen, D. Coudert, A. Lancin)
Indexing and space: O(nm) vs O(n).
Query: O(n) vs O(log n).
Exact distance vs error bound 2δ log n.
Extending metrics:
Local clustering coefficient: $C_i = \frac{2\,|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}|}{k_i (k_i - 1)}$
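For an undirected graph, the coefficient counts how many pairs of neighbors of v_i are themselves connected. The sketch below assumes the graph is stored as a dictionary of neighbor sets and returns 0.0 for vertices with fewer than two neighbors, which is a convention chosen here rather than something stated on the slide.

```python
from itertools import combinations

def local_clustering(adj, i):
    """Local clustering coefficient C_i of vertex i in an undirected graph."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention: undefined case mapped to 0
    # Count edges e_jk whose endpoints are both neighbors of i.
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))
```

A vertex inside a triangle scores 1.0, while the center of a star scores 0.0 because none of its neighbors are linked to each other.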
Hong, Ong Xuan (Data Science School) A Geometric Distance Oracle for Large Real-World GraphsNovember 16, 2017 30 / 30

More Related Content

What's hot

H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14
Sri Ambati
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
MLconf
 
How to win data science competitions with Deep Learning
How to win data science competitions with Deep LearningHow to win data science competitions with Deep Learning
How to win data science competitions with Deep Learning
Sri Ambati
 
Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)
Alexander Korbonits
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data Streams
Albert Bifet
 
Webinar: Deep Learning with H2O
Webinar: Deep Learning with H2OWebinar: Deep Learning with H2O
Webinar: Deep Learning with H2O
Sri Ambati
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with R
Poo Kuan Hoong
 
Deep Learning and Reinforcement Learning
Deep Learning and Reinforcement LearningDeep Learning and Reinforcement Learning
Deep Learning and Reinforcement Learning
Renārs Liepiņš
 
San Francisco Hadoop User Group Meetup Deep Learning
San Francisco Hadoop User Group Meetup Deep LearningSan Francisco Hadoop User Group Meetup Deep Learning
San Francisco Hadoop User Group Meetup Deep Learning
Sri Ambati
 
Deep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingDeep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image Processing
Grigory Sapunov
 
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Eiji Sekiya
 
Deep Learning with TensorFlow: Understanding Tensors, Computations Graphs, Im...
Deep Learning with TensorFlow: Understanding Tensors, Computations Graphs, Im...Deep Learning with TensorFlow: Understanding Tensors, Computations Graphs, Im...
Deep Learning with TensorFlow: Understanding Tensors, Computations Graphs, Im...
Altoros
 
H20: A platform for big math
H20: A platform for big math H20: A platform for big math
H20: A platform for big math
DataWorks Summit/Hadoop Summit
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.
Albert Bifet
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream Analytics
Albert Bifet
 
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Greg Makowski
 
TensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkTensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache Spark
Databricks
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
Albert Bifet
 
Applying your Convolutional Neural Networks
Applying your Convolutional Neural NetworksApplying your Convolutional Neural Networks
Applying your Convolutional Neural Networks
Databricks
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
Adam Rogers
 

What's hot (20)

H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14
 
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SFTed Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
Ted Willke, Senior Principal Engineer & GM, Datacenter Group, Intel at MLconf SF
 
How to win data science competitions with Deep Learning
How to win data science competitions with Deep LearningHow to win data science competitions with Deep Learning
How to win data science competitions with Deep Learning
 
Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data Streams
 
Webinar: Deep Learning with H2O
Webinar: Deep Learning with H2OWebinar: Deep Learning with H2O
Webinar: Deep Learning with H2O
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with R
 
Deep Learning and Reinforcement Learning
Deep Learning and Reinforcement LearningDeep Learning and Reinforcement Learning
Deep Learning and Reinforcement Learning
 
San Francisco Hadoop User Group Meetup Deep Learning
San Francisco Hadoop User Group Meetup Deep LearningSan Francisco Hadoop User Group Meetup Deep Learning
San Francisco Hadoop User Group Meetup Deep Learning
 
Deep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingDeep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image Processing
 
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
 
Deep Learning with TensorFlow: Understanding Tensors, Computations Graphs, Im...
Deep Learning with TensorFlow: Understanding Tensors, Computations Graphs, Im...Deep Learning with TensorFlow: Understanding Tensors, Computations Graphs, Im...
Deep Learning with TensorFlow: Understanding Tensors, Computations Graphs, Im...
 
H20: A platform for big math
H20: A platform for big math H20: A platform for big math
H20: A platform for big math
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream Analytics
 
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...
 
TensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkTensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache Spark
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
 
Applying your Convolutional Neural Networks
Applying your Convolutional Neural NetworksApplying your Convolutional Neural Networks
Applying your Convolutional Neural Networks
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 

Similar to Distance oracle - Truy vấn nhanh khoảng cách giữa hai điểm bất kỳ trên đồ thị

ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
Advanced-Concepts-Team
 
L4 cluster analysis NWU 4.3 Graphics Course
L4 cluster analysis NWU 4.3 Graphics CourseL4 cluster analysis NWU 4.3 Graphics Course
L4 cluster analysis NWU 4.3 Graphics Course
Mohaiminur Rahman
 
Kriging interpolationtheory
Kriging interpolationtheoryKriging interpolationtheory
Kriging interpolationtheory
湘云 黄
 
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
Yoshihiro Nagano
 
An Effective PSO-inspired Algorithm for Workflow Scheduling
An Effective PSO-inspired Algorithm for Workflow Scheduling An Effective PSO-inspired Algorithm for Workflow Scheduling
An Effective PSO-inspired Algorithm for Workflow Scheduling
IJECEIAES
 
Poster Final
Poster FinalPoster Final
Poster Final
Gireeshma Reddy
 
Graph Edit Distance: Basics & Trends
Graph Edit Distance: Basics & TrendsGraph Edit Distance: Basics & Trends
Graph Edit Distance: Basics & Trends
Luc Brun
 
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
IOSR Journals
 
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
IOSR Journals
 
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on ArraysArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
Goon83
 
Interactive High-Dimensional Visualization of Social Graphs
Interactive High-Dimensional Visualization of Social GraphsInteractive High-Dimensional Visualization of Social Graphs
Interactive High-Dimensional Visualization of Social Graphs
Tokyo Tech (Tokyo Institute of Technology)
 
AI Science
AI Science AI Science
AI Science
Melanie Swan
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
The Statistical and Applied Mathematical Sciences Institute
 
Optics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structureOptics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structure
Rajesh Piryani
 
20140327 - Hashing Object Embedding
20140327 - Hashing Object Embedding20140327 - Hashing Object Embedding
20140327 - Hashing Object Embedding
Jacob Xu
 
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
Anirbit Mukherjee
 
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...
Classification of Iris Data using Kernel Radial Basis Probabilistic  Neural N...Classification of Iris Data using Kernel Radial Basis Probabilistic  Neural N...
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...
Scientific Review SR
 
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Ne...
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Ne...Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Ne...
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Ne...
Scientific Review
 
Pycon9 dibernado
Pycon9 dibernadoPycon9 dibernado
Pycon9 dibernado
GIUSEPPE DI BERNARDO
 
Improving search time for contentment based image retrieval via, LSH, MTRee, ...
Improving search time for contentment based image retrieval via, LSH, MTRee, ...Improving search time for contentment based image retrieval via, LSH, MTRee, ...
Improving search time for contentment based image retrieval via, LSH, MTRee, ...
IOSR Journals
 

Similar to Distance oracle - Truy vấn nhanh khoảng cách giữa hai điểm bất kỳ trên đồ thị (20)

ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
 
L4 cluster analysis NWU 4.3 Graphics Course
L4 cluster analysis NWU 4.3 Graphics CourseL4 cluster analysis NWU 4.3 Graphics Course
L4 cluster analysis NWU 4.3 Graphics Course
 
Kriging interpolationtheory
Kriging interpolationtheoryKriging interpolationtheory
Kriging interpolationtheory
 
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
 
An Effective PSO-inspired Algorithm for Workflow Scheduling
An Effective PSO-inspired Algorithm for Workflow Scheduling An Effective PSO-inspired Algorithm for Workflow Scheduling
An Effective PSO-inspired Algorithm for Workflow Scheduling
 
Poster Final
Poster FinalPoster Final
Poster Final
 
Graph Edit Distance: Basics & Trends
Graph Edit Distance: Basics & TrendsGraph Edit Distance: Basics & Trends
Graph Edit Distance: Basics & Trends
 
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
 
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
Feature Extraction Based Estimation of Rain Fall By Cross Correlating Cloud R...
 
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on ArraysArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
 
Interactive High-Dimensional Visualization of Social Graphs
Interactive High-Dimensional Visualization of Social GraphsInteractive High-Dimensional Visualization of Social Graphs
Interactive High-Dimensional Visualization of Social Graphs
 
AI Science
AI Science AI Science
AI Science
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
Optics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structureOptics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structure
 
20140327 - Hashing Object Embedding
20140327 - Hashing Object Embedding20140327 - Hashing Object Embedding
20140327 - Hashing Object Embedding
 
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
 
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...
Classification of Iris Data using Kernel Radial Basis Probabilistic  Neural N...Classification of Iris Data using Kernel Radial Basis Probabilistic  Neural N...
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...
 
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Ne...
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Ne...Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Ne...
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Ne...
 
Pycon9 dibernado
Pycon9 dibernadoPycon9 dibernado
Pycon9 dibernado
 
Improving search time for contentment based image retrieval via, LSH, MTRee, ...
Improving search time for contentment based image retrieval via, LSH, MTRee, ...Improving search time for contentment based image retrieval via, LSH, MTRee, ...
Improving search time for contentment based image retrieval via, LSH, MTRee, ...
 

More from Hong Ong

Feast Feature Store - An In-depth Overview Experimentation and Application in...
Feast Feature Store - An In-depth Overview Experimentation and Application in...Feast Feature Store - An In-depth Overview Experimentation and Application in...
Feast Feature Store - An In-depth Overview Experimentation and Application in...
Hong Ong
 
Dagster - DataOps and MLOps for Machine Learning Engineers.pdf
Dagster - DataOps and MLOps for Machine Learning Engineers.pdfDagster - DataOps and MLOps for Machine Learning Engineers.pdf
Dagster - DataOps and MLOps for Machine Learning Engineers.pdf
Hong Ong
 
DBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxDBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptx
Hong Ong
 
Data Products for Mobile Commerce in Real-time and Real-life.pdf
Data Products for Mobile Commerce in Real-time and Real-life.pdfData Products for Mobile Commerce in Real-time and Real-life.pdf
Data Products for Mobile Commerce in Real-time and Real-life.pdf
Hong Ong
 
VWS2017: Bắt đầu Big Data từ đâu và như thế nào?
VWS2017: Bắt đầu Big Data từ đâu và như thế nào?VWS2017: Bắt đầu Big Data từ đâu và như thế nào?
VWS2017: Bắt đầu Big Data từ đâu và như thế nào?
Hong Ong
 
Nền tảng thuật toán của AI, Machine Learning, Big Data
Nền tảng thuật toán của AI, Machine Learning, Big DataNền tảng thuật toán của AI, Machine Learning, Big Data
Nền tảng thuật toán của AI, Machine Learning, Big Data
Hong Ong
 
Bắt đầu nghiên cứu Big Data
Bắt đầu nghiên cứu Big DataBắt đầu nghiên cứu Big Data
Bắt đầu nghiên cứu Big Data
Hong Ong
 
Bắt đầu học data science
Bắt đầu học data scienceBắt đầu học data science
Bắt đầu học data science
Hong Ong
 

Distance oracle - Fast queries for the distance between any two points on a graph

  • 1. A Geometric Distance Oracle for Large Real-World Graphs. Hong, Ong Xuan, Data Science School, November 16, 2017.
  • 2. Contents: 1 Introduction; 2 Background; 3 Related works; 4 Proposed method; 5 Evaluation; 6 Results; 7 Discussion.
  • 3. Introduction
    The explosion of available information drives the mining of interactions between subscribers, groups, people, objects, etc.
    A fundamental graph computation is finding the shortest-path distance between arbitrary nodes, but:
      Computing and querying distances is slow.
      Memory for storing the graph is limited.
    How can this analysis be done effectively?
  • 5. Background
    Graph theory.
    Distance oracle.
    Approximate distance.
    Metric space: Euclidean, hyperbolic.
    δ-hyperbolic metric space.
  • 6. Graph theory
    Let $G(V, E)$ be an undirected, weighted graph with $n = |V|$ nodes and $m = |E|$ edges. What is the distance between nodes $s$ and $t$?
      Dijkstra's algorithm: $O(m + n \log n)$ per query with a Fibonacci heap; requires no precomputed storage.
      All-pairs distance matrix: query time $O(1)$; requires $O(n^2)$ extra space.
      Floyd-Warshall algorithm: returns all-pairs shortest paths; initialization takes $O(n^3)$ time.
    How can we use less than $O(n^2)$ space and answer queries in less than $O(m + n \log n)$ time?
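The Dijkstra baseline named on this slide can be sketched as follows. This is a minimal illustration, not the author's code: it uses Python's binary heap (`heapq`), so it runs in $O((m + n) \log n)$ rather than the Fibonacci-heap bound quoted above, and the adjacency-list layout (`adj` mapping a node to `(neighbour, weight)` pairs) is an assumption made for the example.

```python
import heapq

def dijkstra(adj, source):
    """Return shortest-path distances from `source` to every reachable node.

    `adj` is an assumed layout: {node: [(neighbour, weight), ...]} with
    non-negative weights.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        # Skip stale heap entries that were superseded by a shorter path.
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Every query pays for a fresh traversal of the graph, which is the cost a distance oracle is meant to avoid.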
  • 7. Distance oracle
    A distance oracle is a data structure that is cheap to compute and fast to query (ideally in constant time). It should satisfy four properties:
      Preprocessing time: $O(n)$ or $O(n \log n)$.
      Storage: less than $O(n^2)$.
      Query time: less than $O(m + n \log n)$.
      Fidelity: approximate distances as close as possible to the actual distances.
  • 8. Approximate distance oracles
    Spanning trees and distance labeling are used to approximate distances (Thorup and Zwick):
      Preprocessing time: $O(k m n^{1/k})$.
      Storage: $O(k n^{1+1/k})$.
      Query time: $O(k)$.
      Fidelity: the ratio of estimated to actual distance lies in $[1, 2k - 1]$.
    Note: $k = 1, 2, \ldots, \log n$; higher values of $k$ do not improve the space or preprocessing time.
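To make the trade-off concrete, the bounds above can be instantiated for three values of $k$. This is a worked reading of the stated formulas, taking $\log$ to base 2 (an assumption) so that $n^{1/\log_2 n} = 2$:

```latex
\begin{aligned}
k = 1:        &\quad \text{preprocessing } O(mn),          &&\text{storage } O(n^{2}),      &&\text{stretch } 1 \text{ (exact)} \\
k = 2:        &\quad \text{preprocessing } O(m\sqrt{n}),   &&\text{storage } O(n^{3/2}),    &&\text{stretch} \le 3 \\
k = \log_2 n: &\quad \text{preprocessing } O(m \log n),    &&\text{storage } O(n \log n),   &&\text{stretch} \le 2\log_2 n - 1
\end{aligned}
```

Beyond $k = \log n$ the factor $n^{1/k}$ is already a constant, so only the leading factor $k$ and the stretch keep growing.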
  • 9. Metric space
    An ordered pair $(M, d)$, where $M$ is a set and $d : M \times M \to \mathbb{R}$ is a metric, such that for all $x, y, z \in M$:
      $d(x, y) \ge 0$
      $d(x, y) = 0 \iff x = y$
      $d(x, y) = d(y, x)$
      $d(x, z) \le d(x, y) + d(y, z)$
  • 10. Euclidean distance
    $d(p, q) = d(q, p) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \cdots + (q_n - p_n)^2} = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$
  • 11. Hyperbolic distance
    $d(\langle x_1, y_1 \rangle, \langle x_2, y_2 \rangle) = \operatorname{arcosh}(\cosh y_1 \cosh(x_2 - x_1) \cosh y_2 - \sinh y_1 \sinh y_2)$
    where:
      $\sinh x = \frac{e^{x} - e^{-x}}{2}$ (hyperbolic sine).
      $\cosh x = \frac{e^{x} + e^{-x}}{2}$ (hyperbolic cosine).
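The formula on this slide translates directly into code. The sketch below is only an illustration: the function name and the tuple layout `(x, y)` are assumptions, and the argument of `acosh` is clamped to 1 to guard against floating-point rounding when the two points coincide.

```python
import math

def hyperbolic_distance(p1, p2):
    """Distance between two points given as (x, y) pairs, per the slide's formula."""
    x1, y1 = p1
    x2, y2 = p2
    arg = (math.cosh(y1) * math.cosh(x2 - x1) * math.cosh(y2)
           - math.sinh(y1) * math.sinh(y2))
    # Rounding can push the argument slightly below 1, where acosh is undefined.
    return math.acosh(max(1.0, arg))
```

Two sanity checks follow from the formula itself: with $y_1 = y_2 = 0$ it reduces to $|x_2 - x_1|$, and with $x_1 = x_2$ it reduces to $|y_2 - y_1|$.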
  • 12. δ-hyperbolic metric space
    A metric space $(V, d)$ embeds into a tree metric iff the four-point condition holds.
    For all $w, x, y, z \in V$, form the three pairwise sums and label them so that $S \le M \le L$:
      $S := S(w, x, y, z) = d(w, x) + d(y, z)$
      $M := M(w, x, y, z) = d(x, y) + d(w, z)$
      $L := L(w, x, y, z) = d(x, z) + d(w, y)$
    The space is δ-hyperbolic, for some $\delta \ge 0$, if $(L - M)/2 \le \delta$ for every such quadruple; $\delta = 0$ is the tree-metric case.
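The four-point condition can be checked by brute force on small graphs. The sketch below is an illustration under stated assumptions: `d` is a precomputed all-pairs distance table indexed as `d[u][v]`, and the function name is hypothetical. It examines every quadruple, so it costs $O(n^4)$ and is only practical for small inputs.

```python
from itertools import combinations

def four_point_delta(d, nodes):
    """Smallest delta for which the four-point condition holds on `nodes`.

    `d[u][v]` is assumed to hold the shortest-path distance between u and v.
    """
    delta = 0.0
    for w, x, y, z in combinations(nodes, 4):
        # The three pairwise sums; sorting gives S <= M <= L.
        sums = sorted((d[w][x] + d[y][z],
                       d[x][y] + d[w][z],
                       d[x][z] + d[w][y]))
        delta = max(delta, (sums[2] - sums[1]) / 2)
    return delta
```

A path (which is a tree) yields 0, while a 4-cycle yields 1, which matches the intuition that cycles are what make a graph less tree-like.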
  • 14. Related works
    Theoretical results provide guaranteed approximation bounds for specific graph classes:
      Distance labeling in hyperbolic graphs
      A Note on Distance Approximating Trees in Graphs
      Additive spanners and distance and routing labeling schemes for hyperbolic graphs
      A compact routing scheme and approximate distance oracle for power-law graphs
      Reconstructing approximate tree metrics
      Essays in Group Theory
      Diameters, centers, and approximating trees of δ-hyperbolic geodesic spaces and graphs
    But these have not been empirically evaluated on real-world graphs.
  • 15. Related works
    Spanning trees:
      Quick query: $O(n \log n)$.
      Reduced storage space.
  • 16. Related works
    Approximate distance oracles developed on empirical graphs (small-world graphs, hypergrid graphs, Facebook, telecom, Google News graph, web graph, etc.):
      Efficient Shortest Paths on Massive Social Graphs
      Fast fully dynamic landmark-based estimation of shortest path distances in very large graphs
      Querying Shortest Path Distance with Bounded Errors in Large Graphs
      Orion: shortest path estimation for large social graphs
      Approximating Shortest Paths in Social Graphs
      Fast exact shortest-path distance queries on large networks by pruned landmark labeling
      Toward a distance oracle for billion-node graphs
    But these heuristics lack a theoretical foundation.
  • 17. Related works
  • 19. Proposed method
    Hyperbolicity-based Breadth-First Search (HyperBFS).
    Uses the notion of graph hyperbolicity in real-world networks to build spanning trees with:
      Height: $O(\log n)$.
      Distance queries: $O(\log n)$.
      Storage: $O(n)$ words of space for an n-node graph.
  • 20. Algorithm
    Hyperbolicity-based tree oracle: constructing the geometric oracle.
      Choose a highly central vertex (centrality measured via shortest paths) as the root. In practice, out-degree is used instead, because in power-law networks the two are correlated.
      Build 1 to 10 BFS trees with distinct roots, taken in decreasing order of degree; the distance labelings can be computed in parallel.
      The distance between x and y is estimated as the minimum of their distances over the constructed trees.
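The construction described on this slide can be sketched as below. This is a simplified illustration and not the authors' HyperBFS implementation: it assumes an unweighted, connected, undirected graph stored as `{node: [neighbours]}`, uses degree as the root-selection criterion as the slide suggests, and answers tree queries by walking parent pointers up to the lowest common ancestor. All function names are hypothetical.

```python
from collections import deque

def bfs_tree(adj, root):
    """Build a BFS spanning tree; return (parent, depth) dictionaries."""
    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return parent, depth

def tree_distance(tree, x, y):
    """Distance between x and y inside one tree, via their common ancestor."""
    parent, depth = tree
    dist = 0
    while x != y:
        # Always move the deeper endpoint one step towards the root.
        if depth[x] < depth[y]:
            x, y = y, x
        x = parent[x]
        dist += 1
    return dist

def build_oracle(adj, num_trees=3):
    """Root one BFS tree at each of the `num_trees` highest-degree nodes."""
    roots = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:num_trees]
    return [bfs_tree(adj, r) for r in roots]

def query(oracle, x, y):
    """Estimate d(x, y) as the minimum tree distance over all trees."""
    return min(tree_distance(tree, x, y) for tree in oracle)
```

Each tree distance is an upper bound on the true distance, so taking the minimum over several trees can only tighten the estimate. A query costs time proportional to the tree height, which is where the $O(\log n)$ height assumption matters.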
  • 21. Algorithm
    Set 1: Embedding the graph into a multi-dimensional geometric space.
      Map the nodes of the graph to points in hyperbolic space.
      The distance between two d-dimensional points $x = (x_1, x_2, \ldots, x_d)$ and $y = (y_1, y_2, \ldots, y_d)$ is defined as:
      $\operatorname{arcosh}\left(\sqrt{\left(1 + \sum_{i=1}^{d} x_i^2\right)\left(1 + \sum_{i=1}^{d} y_i^2\right)} - \sum_{i=1}^{d} x_i y_i\right) \cdot |c|$
      Note: there are no guarantees on the distance estimation error.
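A minimal sketch of this embedded-distance formula follows. It assumes the square-root (hyperboloid-model) form of the expression and treats `c` as a scaling constant whose value the slide does not specify; the default of 1.0 and the function name are assumptions for illustration.

```python
import math

def hyperboloid_distance(x, y, c=1.0):
    """Distance between two d-dimensional embedded points, scaled by |c|."""
    sum_x = sum(a * a for a in x)
    sum_y = sum(b * b for b in y)
    dot = sum(a * b for a, b in zip(x, y))
    arg = math.sqrt((1.0 + sum_x) * (1.0 + sum_y)) - dot
    # Clamp to 1 so rounding error cannot leave the domain of acosh.
    return math.acosh(max(1.0, arg)) * abs(c)
```

Once coordinates are stored per node, a query touches only two coordinate vectors, so its cost depends on the embedding dimension rather than on the size of the graph.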
  • 22. Algorithm
    Set 2: Gromov-type tree contraction, which improves the accuracy of distance estimates.
      Partition the tree into i-level connected components (coalescing multiple edges into a single edge).
      The additive error is guaranteed not to exceed $2\delta \log n$, where $\delta$ is the hyperbolic constant of the graph.
  • 24. Evaluation
    Four oracles are benchmarked:
      Gromov-type contraction-based tree.
      Steiner trees with a proven multiplicative bound.
      Rigel: landmark-based approach.
      HyperBFS: centrality-based spanning tree oracle.
  • 25. Setup
    Hardware: 2.4 GHz Intel(R) Xeon(R) processor with 190 GB of RAM.
    Distortion: let $x, y$ be vertices of a graph $G$, let $d_G$ be their actual distance in $G$, and let $d_A$ be the distance approximated by a distance oracle:
      Additive distortion: $d_G - d_A$.
      Absolute distortion: $|d_G - d_A|$.
      Multiplicative distortion: $\frac{|d_G - d_A|}{d_G}$.
    Figure: Computational time of HyperBFS on Call Graph II.
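The three distortion measures are simple enough to state as code. The helper names below are hypothetical; the only addition to the slide's definitions is an explicit error when $d_G = 0$, where the multiplicative form is undefined.

```python
def additive_distortion(d_g, d_a):
    """Signed error: actual distance minus approximated distance."""
    return d_g - d_a

def absolute_distortion(d_g, d_a):
    """Magnitude of the error, ignoring its sign."""
    return abs(d_g - d_a)

def multiplicative_distortion(d_g, d_a):
    """Error relative to the actual distance; undefined when d_g is 0."""
    if d_g == 0:
        raise ValueError("multiplicative distortion is undefined for d_G = 0")
    return abs(d_g - d_a) / d_g
```

For a pair at actual distance 4 that an oracle estimates as 6, the additive distortion is -2, the absolute distortion is 2, and the multiplicative distortion is 0.5.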
  • 27. Average absolute error
    Figure: Average absolute error on various real-world graphs.
  • 28. Average additive and multiplicative error
    Figure: Average additive and multiplicative error on the SantaBarbara Facebook graph.
  • 30. Discussion
    Exact and approximate algorithms for computing the hyperbolicity of large-scale graphs (N. Cohen, D. Coudert, A. Lancin):
      Indexing and space: $O(nm)$ vs $O(n)$.
      Query: $O(n)$ vs $O(\log n)$.
      Exact distance vs error bound $2\delta \log n$.
    Extending metrics:
      Local clustering coefficient: $C_i = \frac{2\,|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}|}{k_i (k_i - 1)}$
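The local clustering coefficient above can be computed per node as in the sketch below. It assumes an undirected graph stored as `{node: set_of_neighbours}` (an assumed layout) and returns 0.0 for nodes with fewer than two neighbours, where the formula's denominator vanishes; that convention is a choice made for the example.

```python
from itertools import combinations

def local_clustering(adj, i):
    """Fraction of pairs of neighbours of node i that are themselves linked."""
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0
    # Count edges e_jk whose endpoints are both neighbours of i.
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))
```

For a node with three neighbours of which exactly one pair is connected, this gives $2 \cdot 1 / (3 \cdot 2) = 1/3$.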