Hierarchical representation with
hyperbolic geometry
2016-20873 Segwang Kim 1
① Embedding Symbolic and Hierarchical Data
② Introduction to Hyperbolic Space
③ Optimization over Hyperbolic Space
④ Toy Experiments
Overview
2
3
Embedding Symbolic and Hierarchical Data
Symbolic and Hierarchical Data
4
Symbolic data with an implicit hierarchy.
Downstream tasks: link prediction, node classification, community detection, visualization.
[Figures: WordNet hierarchy; Twitter social graph, with a "?LINK" edge (link prediction) and a community]
Good Hierarchical Embedding
5
For downstream tasks, symbolic and hierarchical data needs to
be embedded into a continuous space.
Good Embedding?
Embeddings of similar symbols should aggregate in some sense.
Symbolic arithmetic exists: v(King) − v(man) + v(woman) = v(Queen)
Hierarchy can be restored from embedded data.
The space should have low dimension.
6
Introduction to Hyperbolic Space
Limitation of Euclidean Embedding
7
Embed graph structure while preserving distances
Thm) Trees cannot be embedded into Euclidean space with
arbitrarily low distortion for any number of dimensions
[Figure: a tree with nodes a, b, c, d, shown as a graph, as a Euclidean embedding, and as a "??" embedding]

         Graph   Euclidean   ??
D(a,b)   2       0.1         1.889
D(a,c)   2       1           1.902
D(a,d)   2       1.8         1.962
Representation tradeoffs for hyperbolic Embeddings (ICML 2018)
Euclidean Space vs Hyperbolic space
8
๐‘€ = ๐ท ๐‘› = {๐‘ฅ โˆˆ โ„ ๐‘› โˆถ ๐‘ฅ1
2
+ โ‹ฏ + ๐‘ฅ ๐‘›
2 < 1}
(๐ท ๐‘›
,
2
1โˆ’||๐‘ฅ||2
2
๐‘”)๐‘” = ๐‘‘๐‘ฅ1 2
+ โ‹ฏ + ๐‘‘๐‘ฅ ๐‘› 2
Euclidean Hyperbolic
(โ„ ๐‘›, ๐‘”)
๐‘€ = โ„ ๐‘›
Metric tensor : inner product on tangent space
= ๐‘‘๐‘ฅ1 ๐‘ข ๐‘‘๐‘ฅ1 ๐‘ฃ + โ‹ฏ + ๐‘‘๐‘ฅ ๐‘› ๐‘ข ๐‘‘๐‘ฅ ๐‘›(๐‘ฃ)
= ๐‘ข1 ๐‘ฃ1 + โ‹ฏ + ๐‘ข ๐‘› ๐‘ฃ ๐‘›
โˆ€ ๐‘ข, ๐‘ฃ โˆˆ ๐‘‡๐‘โ„ ๐‘›
where ๐‘ โˆˆ โ„ ๐‘›
๐‘ข, ๐‘ฃ ๐‘ = ๐‘ข ๐‘ก ๐‘”๐‘ฃ
=
2
1 โˆ’ ||๐‘||2
2
(๐‘ข1 ๐‘ฃ1 + โ‹ฏ + ๐‘ข ๐‘› ๐‘ฃ ๐‘›)
โˆ€ ๐‘ข, ๐‘ฃ โˆˆ ๐‘‡๐‘ ๐ท ๐‘›
where ๐‘ โˆˆ ๐ท ๐‘›
๐‘ข, ๐‘ฃ ๐‘ = ๐‘ข ๐‘ก
(
2
1 โˆ’ ||๐‘||2
2
๐‘”)๐‘ฃ
Give Riemannian Metric
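As a concrete check, the Poincaré inner product above can be computed directly; a minimal stdlib sketch (the helper names `dot` and `poincare_inner` are mine, not from the slides):

```python
def dot(u, v):
    # Euclidean dot product of two same-length vectors.
    return sum(a * b for a, b in zip(u, v))

def poincare_inner(p, u, v):
    # <u, v>_p = (2 / (1 - ||p||^2))^2 * (u . v) on the Poincare ball.
    lam = 2.0 / (1.0 - dot(p, p))  # conformal factor of the metric
    return lam ** 2 * dot(u, v)
```

At the origin the factor is 2, so the inner product is just 4 times the Euclidean one; near the boundary it blows up, as the slide notes.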
Euclidean Space vs Hyperbolic space
9
The inner product $\langle \cdot, \cdot \rangle_p$ on $T_p D^n$ defines:

Length of a curve $\gamma : [0, 1] \to D^n$ → $L(\gamma) = \int_0^1 \langle \gamma'_t, \gamma'_t \rangle_{\gamma_t}^{1/2}\, dt$

Angle between $w_1, w_2 \in T_p D^n$ → $\frac{\langle w_1, w_2 \rangle_p}{\left(\langle w_1, w_1 \rangle_p \cdot \langle w_2, w_2 \rangle_p\right)^{1/2}}$ (its cosine)

The line (geodesic) between $p, q \in M$ is the shortest path between them:
$\gamma^* = \operatorname{argmin}_{\gamma_0 = p,\ \gamma_1 = q} \int_0^1 \langle \gamma'_t, \gamma'_t \rangle_{\gamma_t}^{1/2}\, dt$

[Figure: the shortest path between p and q in Euclidean vs hyperbolic space]

Since $\left(\frac{2}{1 - \|x\|^2}\right)^2 g \to \infty$ as $\|x\| \to 1$, lengths blow up near the boundary of $D^n$.
Equivalent Hyperbolic Models
10
We can choose among the (equivalent, isometric) hyperbolic models depending on the purpose.

Poincaré Model (for visualization):
$D^n = \{x \in \mathbb{R}^n : x_1^2 + \cdots + x_n^2 < 1\}$,
$\left(D^n, \left(\frac{2}{1 - \|x\|^2}\right)^2 (dx_1^2 + \cdots + dx_n^2)\right)$

Lorentz Model (for optimization):
$(\mathcal{L}^n, -dx_0^2 + dx_1^2 + \cdots + dx_n^2)$

The two models are ISOMETRIC via $(x_0, \ldots, x_n) \mapsto \left(\frac{x_1}{1 + x_0}, \ldots, \frac{x_n}{1 + x_0}\right)$.
Learning Continuous Hierarchies in the Lorentz Model of Hyperbolic Geometry (ICML 2018)
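The isometry above is easy to exercise numerically; a sketch (the lifting helper is mine, using the Lorentz-model constraint $-x_0^2 + x_1^2 + \cdots + x_n^2 = -1$, $x_0 > 0$):

```python
import math

def lift_to_hyperboloid(x1, x2):
    # Lift (x1, x2) onto L^2 by solving -x0^2 + x1^2 + x2^2 = -1 with x0 > 0.
    return (math.sqrt(1.0 + x1 * x1 + x2 * x2), x1, x2)

def lorentz_to_poincare(x):
    # The isometry from the slide: (x0, ..., xn) -> (x1/(1+x0), ..., xn/(1+x0)).
    return tuple(xi / (1.0 + x[0]) for xi in x[1:])
```

The apex of the hyperboloid maps to the center of the ball, and every image point lands strictly inside the unit disk.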
11
Optimization Techniques
Suggested loss function
12
An example of a loss function over hyperbolic space.
Fundamentally, the gradient of the loss tells in which direction each point should move.
Poincarรฉ Embeddings for Learning Hierarchical Representations (ICML 2017)
Gradient Descent Algorithm
13
Input: ๐‘“: ๐ฟ2 โ†’ โ„, ๐‘0 โˆˆ ๐ฟ2, ๐‘˜ = 0
repeat
choose a descent direction ๐‘ฃ ๐‘˜ โˆˆ ๐‘‡๐‘ ๐‘˜
๐ฟ2
choose a retraction ๐‘… ๐‘ ๐‘˜
: ๐‘‡๐‘ ๐‘˜
๐ฟ2
โ†’ ๐ฟ2
choose a step length ๐›ผ ๐‘˜ โˆˆ โ„
set ๐‘ ๐‘˜+1 = ๐‘… ๐‘ ๐‘˜
(๐›ผ ๐‘˜ ๐‘ฃ ๐‘˜)
๐‘˜ โ† ๐‘˜ + 1
until ๐‘ ๐‘˜+1 sufficiently minimize ๐‘“
This is the usual gradient descent loop except for two ingredients:
the gradient direction and the retraction.
Optimization methods on Riemannian manifolds and their application to shape space (SIAM 2012)
Gradient Descent Algorithm
14
Input: ๐‘“: ๐ฟ2 โ†’ โ„, ๐‘0 โˆˆ ๐ฟ2, ๐‘˜ = 0
repeat
choose a descent direction ๐‘ฃ ๐‘˜ โˆˆ ๐‘‡๐‘ ๐‘˜
๐ฟ2
choose a retraction ๐‘… ๐‘ ๐‘˜
: ๐‘‡๐‘ ๐‘˜
๐ฟ2
โ†’ ๐ฟ2
choose a step length ๐›ผ ๐‘˜ โˆˆ โ„
set ๐‘ ๐‘˜+1 = ๐‘… ๐‘ ๐‘˜
(๐›ผ ๐‘˜ ๐‘ฃ ๐‘˜)
๐‘˜ โ† ๐‘˜ + 1
until ๐‘ ๐‘˜+1 sufficiently minimize ๐‘“
What is the gradient on hyperbolic space?
$f : (\mathcal{L}^2, -dx_0^2 + dx_1^2 + dx_2^2) \to \mathbb{R}$, $\nabla f = {}?$
Hyperboloid model
15
$L^2 = \{p \in \mathbb{R}^3 : \langle p, p \rangle_{\mathcal{L}} = -1,\ p_0 > 0\}$, $T_p L^2 = \{v \in \mathbb{R}^3 : \langle v, p \rangle_{\mathcal{L}} = 0\}$, and $f : (\mathcal{L}^2, -dx_0^2 + dx_1^2 + dx_2^2) \to \mathbb{R}$.

First, find $\nabla_{\mathbb{R}^{2:1}} f|_p \in \mathbb{R}^3$ s.t. $\langle \nabla_{\mathbb{R}^{2:1}} f|_p, v \rangle_{\mathcal{L}} = df(v)|_p$:
$\nabla_{\mathbb{R}^{2:1}} f|_p = (-dx_0^2 + dx_1^2 + dx_2^2)^{-1} \cdot$ (the usual derivative, e.g. from TensorFlow autodiff), i.e. the Euclidean gradient with the sign of its $x_0$ component flipped.

Second, project $\nabla_{\mathbb{R}^{2:1}} f|_p$ onto $T_p L^2$:
$\nabla_{L^2} f|_p = \nabla_{\mathbb{R}^{2:1}} f|_p + \langle \nabla_{\mathbb{R}^{2:1}} f|_p, p \rangle_{\mathcal{L}}\, p$.
Gradient descent in hyperbolic space (Arxiv 2018)
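The two-step recipe can be sketched for $\mathcal{L}^2$ as follows (function names are mine; `grad_f` stands in for the usual autodiff gradient):

```python
def minkowski_inner(u, v):
    # <u, v>_L = -u0*v0 + u1*v1 + u2*v2.
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def riemannian_grad(grad_f, p):
    # Step 1: apply the inverse metric diag(-1, 1, 1), i.e. flip the x0 component.
    h = (-grad_f[0], grad_f[1], grad_f[2])
    # Step 2: project onto T_p L^2 via h + <h, p>_L * p.
    c = minkowski_inner(h, p)
    return tuple(hi + c * pi for hi, pi in zip(h, p))
```

By construction the result is Minkowski-orthogonal to $p$, i.e. it lies in $T_p L^2$: $\langle h + \langle h, p \rangle_{\mathcal{L}} p,\ p \rangle_{\mathcal{L}} = \langle h, p \rangle_{\mathcal{L}} (1 + \langle p, p \rangle_{\mathcal{L}}) = 0$.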
Gradient Descent Algorithm
16
Input: ๐‘“: ๐ฟ2 โ†’ โ„, ๐‘0 โˆˆ ๐ฟ2, ๐‘˜ = 0
repeat
choose a descent direction ๐‘ฃ ๐‘˜ โˆˆ ๐‘‡๐‘ ๐‘˜
๐ฟ2
choose a retraction ๐‘… ๐‘ ๐‘˜
: ๐‘‡๐‘ ๐‘˜
๐ฟ2
โ†’ ๐ฟ2
choose a step length ๐›ผ ๐‘˜ โˆˆ โ„
set ๐‘ ๐‘˜+1 = ๐‘… ๐‘ ๐‘˜
(๐›ผ ๐‘˜ ๐‘ฃ ๐‘˜)
๐‘˜ โ† ๐‘˜ + 1
until ๐‘ ๐‘˜+1 sufficiently minimize ๐‘“
What is the retraction on Hyperbolic space?
Hyperboloid model
17
A retraction maps the endpoint of a tangent vector back to a point on the manifold.
We choose the geodesic (exponential map) as the retraction: at $p \in L^2$ with direction $v \in T_p L^2$,
$\gamma_t = \cosh(\|v\|_{\mathcal{L}} t)\, p + \sinh(\|v\|_{\mathcal{L}} t)\, \frac{v}{\|v\|_{\mathcal{L}}}$.
[Figure: a raw tangent step lands at $q' \notin L^2$; the retraction returns $R(q') \in L^2$]
Gradient Descent Algorithm
18
Input: ๐‘“: ๐ฟ2 โ†’ โ„, ๐‘0 โˆˆ ๐ฟ2, ๐‘˜ = 0
repeat
choose a descent direction ๐‘ฃ ๐‘˜ โˆˆ ๐‘‡๐‘ ๐‘˜
๐ฟ2
choose a retraction ๐‘… ๐‘ ๐‘˜
: ๐‘‡๐‘ ๐‘˜
๐ฟ2
โ†’ ๐ฟ2
choose a step length ๐›ผ ๐‘˜ โˆˆ โ„
set ๐‘ ๐‘˜+1 = ๐‘… ๐‘ ๐‘˜
(๐›ผ ๐‘˜ ๐‘ฃ ๐‘˜)
๐‘˜ โ† ๐‘˜ + 1
until ๐‘ ๐‘˜+1 sufficiently minimize ๐‘“
The next point becomes
$p_{k+1} = R_{p_k}(\alpha_k v_k) = \cosh(\|v_k\|_{\mathcal{L}} \alpha_k)\, p_k + \sinh(\|v_k\|_{\mathcal{L}} \alpha_k)\, \frac{v_k}{\|v_k\|_{\mathcal{L}}}$.
Simple Optimization Task1
19
GD with gradients: $p_t = p_{t-1} - \alpha \cdot \nabla_E L(p_{t-1})$
GD with R-gradients: $p_t = p_{t-1} - \alpha \cdot \nabla_R L(p_{t-1})$
R-GD with R-gradients: $p_t = \gamma_\alpha$, where $\gamma_0 = p_{t-1}$ and $\gamma'_0 = -\nabla_R L(p_{t-1})$

Loss over iterations:
GD with gradients: 3.3024998, 4.7424998, 4.7859879, 4.8213577, 4.851644, 4.8784704, 4.9028177, 4.9253302
GD with R-gradients: 3.3024998, 3.3081245, 3.3175893, 3.3334663, 3.3599658, 3.403821, 3.4753809, 3.5894651
R-GD with R-gradients: 3.3024998, 3.3025002, 3.3025002, 3.3025002, 3.3025005, 3.3025, 3.3025002, 3.3025005

Only R-GD with R-gradients keeps the loss at its starting value; the other two updates drive it up.
Simple Optimization Task2
20
๐ฟ(๐‘) =
๐‘–
๐‘‘ ๐ฟ2 ๐‘, ๐‘ฅ๐‘–
2
โ€œBarycenterโ€ can be found by minimizing
Simple Optimization Task2
21
Simple Optimization Task2
22
๐ฟ(๐‘) =
๐‘–
๐‘‘ ๐ฟ2 ๐‘, ๐‘ฅ๐‘–
2
โ€œBarycenterโ€
can be found by minimizing
Takeaways
23
Hyperbolic space is promising for representing symbolic and
hierarchical datasets.
Geometry determines the path toward optimal points.
Regardless of the optimization technique, the optimal point depends
only on the loss function.
Interpretation: can the path itself carry semantics?
A loss function over hyperbolic space should be chosen carefully.
Is it suitable for the given geometry? Is it differentiable? Are the required operations available?
Unfortunately, we lose simple vector arithmetic.

20180831 riemannian representation learning

  • 9. Euclidean Space vs Hyperbolic space 9
  The inner product ⟨·,·⟩_p on T_p D^n defines:
  Length of a curve γ: [0,1] → D^n: L(γ) = ∫₀¹ ⟨γ′(t), γ′(t)⟩_{γ(t)}^{1/2} dt
  Angle between w1, w2 ∈ T_p D^n: ⟨w1, w2⟩_p / (⟨w1, w1⟩_p · ⟨w2, w2⟩_p)^{1/2}
  The "line" between p, q ∈ M is the shortest path between them:
  γ* = argmin ∫₀¹ ⟨γ′(t), γ′(t)⟩_{γ(t)}^{1/2} dt subject to γ(0) = p, γ(1) = q
  Since the metric (2/(1 − ||x||²))² g → ∞ as ||x|| → 1, geodesics bend toward the center of the disk.
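As a concrete sketch of the metric above (numpy-based; the function names are my own, and the closed-form distance is the standard Poincaré-ball formula, not something stated on the slide):

```python
import numpy as np

def conformal_factor(p):
    # lambda_p = 2 / (1 - ||p||^2); blows up as p approaches the boundary
    return 2.0 / (1.0 - np.dot(p, p))

def poincare_inner(p, u, v):
    # Riemannian metric: <u, v>_p = lambda_p^2 * <u, v>_Euclidean
    return conformal_factor(p) ** 2 * np.dot(u, v)

def poincare_dist(p, q):
    # closed-form geodesic distance on the Poincare ball
    sq = np.dot(p - q, p - q)
    denom = (1.0 - np.dot(p, p)) * (1.0 - np.dot(q, q))
    return np.arccosh(1.0 + 2.0 * sq / denom)
```

At the origin the factor is 2, so lengths are simply doubled; the distance from the origin to (r, 0) works out to log((1 + r)/(1 − r)), which diverges as r → 1.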
  • 10. Equivalent Hyperbolic Models 10
  We can choose one of the hyperbolic models depending on the purpose.
  Poincaré model: (D^n, (2/(1 − ||x||²))² (dx1² + ⋯ + dxn²)), where D^n = {x ∈ ℝ^n : x1² + ⋯ + xn² < 1} → for visualization
  Lorentz model: (L^n, −dx0² + dx1² + ⋯ + dxn²) → for optimization
  The two are ISOMETRIC via (x0, …, xn) ↦ (x1/(1 + x0), …, xn/(1 + x0))
  Learning Continuous Hierarchies in the Lorentz Model of Hyperbolic Geometry (ICML 2018)
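The isometry between the two models is cheap to compute and check numerically. A numpy sketch (function names are mine; the lift from the ball back onto the hyperboloid is the standard inverse map, not written on the slide):

```python
import numpy as np

def lorentz_to_poincare(x):
    # the isometry from the slide: (x0, ..., xn) -> (x1, ..., xn) / (1 + x0)
    return x[1:] / (1.0 + x[0])

def poincare_to_lorentz(y):
    # standard inverse: lift a point of the ball onto the hyperboloid,
    # i.e. the unique x with -x0^2 + x1^2 + ... + xn^2 = -1 mapping back to y
    s = np.dot(y, y)  # ||y||^2 < 1
    return np.concatenate(([1.0 + s], 2.0 * y)) / (1.0 - s)
```

Round-tripping a point verifies that the lift really lands on the hyperboloid and that the two maps are inverse to each other.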
  • 12. Suggested loss function 12
  An example of a loss function over hyperbolic space. Fundamentally, the gradients of the loss tell which direction the points should move.
  Poincaré Embeddings for Learning Hierarchical Representations (NIPS 2017)
  • 13. Gradient Descent Algorithm 13
  Input: f: L² → ℝ, p0 ∈ L², k = 0
  repeat:
    choose a descent direction v_k ∈ T_{p_k} L²
    choose a retraction R_{p_k}: T_{p_k} L² → L²
    choose a step length α_k ∈ ℝ
    set p_{k+1} = R_{p_k}(α_k v_k); k ← k + 1
  until p_{k+1} sufficiently minimizes f
  Nothing differs from usual gradient descent except the gradient direction and the retraction.
  Optimization methods on Riemannian manifolds and their application to shape space (SIAM 2012)
  • 14. Gradient Descent Algorithm 14
  (Same loop as slide 13.) What is the gradient on hyperbolic space?
  f: (L², −dx0² + dx1² + dx2²) → ℝ, ∇f = ?
  • 15. Hyperboloid model 15
  f: (L², −dx0² + dx1² + dx2²) → ℝ, where L² = {p ∈ ℝ³ : ⟨p, p⟩_L = −1, p0 > 0} and T_p L² = {v ∈ ℝ³ : ⟨v, p⟩_L = 0}.
  First, find ∇_{ℝ^{2:1}} f|_p ∈ ℝ³ such that ⟨∇_{ℝ^{2:1}} f|_p, v⟩_L = df(v)|_p. Concretely, multiply the usual derivative (from tensorflow) by the inverse metric diag(−1, 1, 1), i.e. flip the sign of its first component.
  Second, project ∇_{ℝ^{2:1}} f|_p onto T_p L²: ∇_{L²} f|_p = ∇_{ℝ^{2:1}} f|_p + ⟨∇_{ℝ^{2:1}} f|_p, p⟩_L p.
  The descent direction is −v_k.
  Gradient descent in hyperbolic space (arXiv 2018)
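The two steps translate directly into code. A numpy sketch (function names are mine; `egrad` stands for the usual derivative returned by an autodiff library such as tensorflow):

```python
import numpy as np

def minkowski_inner(u, v):
    # <u, v>_L = -u0*v0 + u1*v1 + ... + un*vn
    return -u[0] * v[0] + np.dot(u[1:], v[1:])

def lorentz_grad(p, egrad):
    # Step 1: multiply by the inverse metric diag(-1, 1, ..., 1),
    # i.e. flip the sign of the first component of the usual derivative.
    h = egrad.copy()
    h[0] = -h[0]
    # Step 2: project onto the tangent space T_p L^n = {v : <v, p>_L = 0}.
    return h + minkowski_inner(h, p) * p
```

The projection works because ⟨p, p⟩_L = −1 on the hyperboloid, so the ⟨h, p⟩_L terms cancel and the result is Minkowski-orthogonal to p.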
  • 16. Gradient Descent Algorithm 16
  (Same loop as slide 13.) What is the retraction on hyperbolic space?
  • 17. Hyperboloid model 17
  A retraction tells how the end point of a tangent vector corresponds to a point on the manifold: the tip q′ = p + v ∉ L², but R(q′) ∈ L². We choose the geodesic as the retraction: at p ∈ L² with direction v ∈ T_p L²,
  γ(t) = cosh(||v||_L t) p + sinh(||v||_L t) v/||v||_L
  • 18. Gradient Descent Algorithm 18
  (Same loop as slide 13.) The next point becomes
  p_{k+1} = R_{p_k}(α_k v_k) = cosh(||v_k||_L α_k) p_k + sinh(||v_k||_L α_k) v_k/||v_k||_L
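Plugging the geodesic into the update gives a one-line retraction step. A numpy sketch (the function name is mine):

```python
import numpy as np

def minkowski_inner(u, v):
    # <u, v>_L = -u0*v0 + u1*v1 + ... + un*vn
    return -u[0] * v[0] + np.dot(u[1:], v[1:])

def retract(p, v):
    # geodesic retraction: gamma(1) for
    # gamma(t) = cosh(||v||_L t) p + sinh(||v||_L t) v / ||v||_L
    n = np.sqrt(minkowski_inner(v, v))  # ||v||_L is real for tangent v
    if n < 1e-12:
        return p
    return np.cosh(n) * p + np.sinh(n) * v / n
```

With `p_next = retract(p_k, alpha_k * v_k)` the iterate stays exactly on the hyperboloid, and the geodesic distance moved equals the Lorentz norm of the step.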
  • 19. Simple Optimization Task 1 19
  GD with gradients: p_t = p_{t−1} − α · ∇_E L(p_{t−1}) → loss: 3.3024998, 4.7424998, 4.7859879, 4.8213577, 4.851644, 4.8784704, 4.9028177, 4.9253302
  GD with R-gradients: p_t = p_{t−1} − α · ∇_R L(p_{t−1}) → loss: 3.3024998, 3.3081245, 3.3175893, 3.3334663, 3.3599658, 3.403821, 3.4753809, 3.5894651
  R-GD with R-gradients: p_t = γ(α) with γ(0) = p_{t−1}, γ′(0) = −∇_R L(p_{t−1}) → loss: 3.3024998, 3.3025002, 3.3025002, 3.3025002, 3.3025005, 3.3025, 3.3025002, 3.3025005
  • 20. Simple Optimization Task 2 20
  The "barycenter" can be found by minimizing L(p) = Σ_i d_{L²}(p, x_i)²
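This objective can be minimized with exactly the Riemannian gradient descent loop from the earlier slides. A self-contained numpy sketch (the function names, learning rate, and step count are my own choices, not from the slides; the gradient of the squared distance is derived from d(p, x) = arccosh(−⟨p, x⟩_L)):

```python
import numpy as np

def minkowski_inner(u, v):
    # <u, v>_L = -u0*v0 + u1*v1 + u2*v2
    return -u[0] * v[0] + np.dot(u[1:], v[1:])

def project_tangent(p, h):
    # project an ambient vector onto T_p L^2 = {v : <v, p>_L = 0}
    return h + minkowski_inner(h, p) * p

def retract(p, v):
    # geodesic retraction from the previous slides
    n = np.sqrt(max(minkowski_inner(v, v), 0.0))
    if n < 1e-12:
        return p
    return np.cosh(n) * p + np.sinh(n) * v / n

def barycenter(points, lr=0.2, steps=200):
    # minimize L(p) = sum_i d_{L2}(p, x_i)^2 by Riemannian gradient descent
    p = points[0].copy()
    for _ in range(steps):
        grad = np.zeros_like(p)
        for x in points:
            u = -minkowski_inner(p, x)  # cosh of the distance d(p, x)
            if u <= 1.0 + 1e-12:
                continue                # p == x contributes zero gradient
            d = np.arccosh(u)
            # Riemannian gradient of d(p, x)^2: raise the index, then project
            grad += project_tangent(p, -2.0 * d / np.sqrt(u * u - 1.0) * x)
        p = retract(p, -lr * grad)
    return p
```

For two points placed symmetrically about the apex of the hyperboloid, the barycenter should be the apex (1, 0, 0) itself, and every iterate stays on the manifold.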
  • 22. Simple Optimization Task 2 22
  The "barycenter" can be found by minimizing L(p) = Σ_i d_{L²}(p, x_i)²
  • 23. Takeaways 23
  Hyperbolic space is promising for representing symbolic and hierarchical datasets.
  Geometry determines the path toward optimal points. Regardless of the optimization technique, the optimal point depends only on the loss function. Interpretation: can the path entail semantics?
  A loss function over hyperbolic space should be chosen carefully. Is it suitable for the given geometry? Is it differentiable? Are the needed operations defined? Unfortunately, we lose simple arithmetic.

Editor's Notes

  1. Good evening. I am Segwang Kim from the Machine Intelligence Lab. My topic is hierarchical representation with hyperbolic geometry. This is the topic I am currently working on, though I have gotten nothing meaningful yet. I found this topic intriguing in that it suggests alternative ways to represent symbolic and hierarchical datasets, which in turn helps with downstream tasks in natural language processing or social network analysis.
  2. This is an overview. The main goal of this talk is to get you acquainted with hyperbolic representations. First, I will introduce the data of interest to be represented and the conventional way to embed those datasets. Second, I will go over the shortcomings of conventional embedding and introduce the gist of hyperbolic space. Third, I will show an optimization technique over hyperbolic space. Finally, toy experiments follow. Recent papers are included in this presentation.
  3. The datasets I am dealing with, such as WordNet or social networks, are symbolic and hierarchical. They are symbolic because words or users have no meaningful numeric values; they are just symbols. On top of that, they are hierarchical, since there exist partial orderings between data points: dogs belong to mammals, and mammals belong to animals. Or, when a Twitter user follows another, we have an ordering between them. Typical machine learning problems on those datasets are link prediction, node classification, community detection, and visualization. To be specific, someone might ask: are "sprinkler" and "birdcage" linked? Or: what community does a particular user belong to?
  4. To tackle those problems, we need to parametrize symbolic and hierarchical datasets into numeric forms. We call this process embedding. Once data points are embedded into some space, we can apply a machine learning model that works on that space. Even if symbolic data points are represented in numerical form, it is natural to expect that the embedding should agree with our intuition. For instance, two words with similar meanings should be represented as two points that are close to each other. This two-dimensional figure seems to capture semantic relations. Like this, we expect certain properties from a good embedding. Traditionally, we have embedded symbolic data into the most familiar space, Euclidean space.
  5. However, there are some limitations of Euclidean embedding. To illustrate, assume that we want to solve a machine learning problem on this bushy-structured dataset. An edge between two nodes means they have something in common. Therefore, we would want to find an embedding that preserves the distances among nodes as measured in the graph. Unfortunately, the moment you embed the data points into two-dimensional Euclidean space, you realize that huge distortions have been made. While the graph distance between nodes a and b is 2, the Euclidean distance between the corresponding points is far less than 2. To remedy this problem, researchers have increased the dimensionality of the Euclidean space. However, by doing that, we lose the opportunity to analyze the data in low dimension. On top of that, trying to embed trees into Euclidean space is wrong from the beginning. More formally, there is a theorem that trees cannot be embedded into Euclidean space with arbitrarily low distortion for any number of dimensions. So, the main question is: what if we have a space that preserves graph structure well, like this one? What is this mysterious space? Now, it's time to introduce hyperbolic space.
  6. Time for a series of math slides. The best analogy I can use for introducing hyperbolic space is Euclidean space. We can define the geometry of a given space, or manifold, by looking at its domain and the inner product structure on its tangent spaces. Before elaborating on why the inner product structure matters, let's formally define hyperbolic space. Hyperbolic space is a manifold with constant sectional curvature -1, and five different models are used for describing it. They are actually all the same because there exist isometries among them. Anyhow, I pick one of them: the Poincaré disk model. The domain of the n-dimensional Poincaré disk model is the open n-dimensional unit ball. An inner product on the tangent space is defined like this. Unlike Euclidean space, which has the same inner product rule for every tangent space, hyperbolic space has a different inner product structure depending on the point at which the tangent space is attached. In mathematical terms, this is called a Riemannian metric. To compare these two spaces, let's compute an inner product. First, you attach a tangent plane to a given point p in Euclidean or hyperbolic space, and then you pick two arbitrary tangent vectors from the tangent plane. In the Euclidean case, you take the component-wise product and sum. Note that the point p has nothing to do with computing the inner product. In the hyperbolic case, however, this highlighted term is multiplied after the usual inner product. Note that it depends on the point p. Because of this term, strange things happen.
  7. As I said, the inner product on the tangent space governs the geometry of the space, because it defines the length, angle, and "line" of the space. From calculus 101, we know that the length of a given path is defined as the line integral of the norm of the instantaneous velocity, which is a tangent vector. Since the norm is defined once an inner product is given, the Riemannian metric comes into play. Likewise, the angle between two tangent vectors is governed by the inner product structure, because inner products need to be computed. Finally, if we keep in mind that a line is defined not as a straight path but as the shortest path connecting the start and end points, the shape of a line in hyperbolic space must be different. The shortest path is the optimal solution of this variational problem, which seems almost impossible to solve. But mathematicians conclude that a line in hyperbolic space is either a circular arc that intersects the boundary of the n-dimensional ball perpendicularly, or a straight line through the center. Considering that the norm of a tangent vector increases as the base point approaches the boundary, the shortest path is inclined to pass through the region around the center rather than near the boundary. So it is tilted toward the center.
  8. One interesting fact about hyperbolic space is that we can choose one model among the five depending on the situation. Fundamentally, they are all the same because of the existence of isometries. The cited paper suggests that the Poincaré ball model is more adequate for visualization than the Lorentz model, which is defined like this. This is because the Lorentz model is defined in an ambient space with constraints. But the Lorentz model guarantees more computational stability of the gradient than the Poincaré ball model. In the following optimization section, I will explain the optimization technique on the Lorentz model, not the Poincaré model.
  9. This is one example of a loss function over hyperbolic space. As you can see, this loss function has hyperbolic distance terms. Details are omitted, but basically, it disperses irrelevant data points and aggregates relevant ones. Because the gradients of the loss tell which direction the data points should move, we need to know how to compute the derivative of a given loss function.
  10. This is the Riemannian gradient descent algorithm. There are only two parts you need to focus on: first, choosing a descent direction, and second, choosing a retraction.
  11. Choosing a descent direction needs a little more effort than the usual gradient. Let's assume that we want to minimize a loss function over the two-dimensional Lorentz model. Basically, we want to find the gradient of f.
  12. It takes two steps: basically, we need to turn the naïve gradient into a tangent vector. First, once we get a gradient from tensorflow or any API, as shown in the blue box, this value is unique no matter which metric tensor you have chosen. If we interpret the gradient as a linear map from the tangent space to the real numbers, the Riesz representation theorem implies that there is a corresponding vector such that the inner product with that vector equals the gradient map. To find the vector, the inverse of the metric tensor needs to be multiplied with the usual derivative in order to compensate for the extra terms in the hyperbolic inner product. It sounds complicated, but the bottom line is: just flip the sign of the first element of the usual gradient. The second step is projection. Because the Lorentz model is defined in an ambient space, we need to project the resulting vector from the first step onto the tangent plane of the model. It only takes some multiplications and additions. Therefore, we get the Riemannian descent direction by flipping the signs of all components of the hyperbolic gradient of the loss.
  13. A retraction tells how a point can be moved in a given direction. When the point is moved to the tip of the direction vector, it escapes the manifold. This is sad.
  14. However, if the point is moved to the tip of the geodesic instead, it stays on the manifold and we are happy. The geodesic is the hyperbolic version of a line, and this simple formula is all you need.
  15. The last step is trivial. We just iterate the previous steps until we get sufficiently small errors.