A Brief Introduction to Manifold Learning
Wei Yang
platero.yang@gmail.com
2016/8/11
Some slides are from Geometric Methods and Manifold Learning in Machine Learning (Mikhail Belkin and Partha Niyogi), Summer School (MLSS), Chicago 2009.

What is a manifold?
https://en.wikipedia.org/wiki/Manifold

Manifolds in visual perception
Consider a simple example of image variability: the set $M$ of all facial images generated by varying the orientation of a face.
• 1D manifold: a single degree of freedom, the angle of rotation.
• The dimensionality of $M$ would increase if we allowed image scaling, illumination changes, and so on.
H. Sebastian Seung and Daniel D. Lee. "The Manifold Ways of Perception." Science, 22 December 2000.

Why Manifolds?
• Euclidean distance in the high-dimensional input space may not accurately reflect intrinsic similarity.
  – Euclidean distance vs. geodesic distance
• Linear manifold vs. nonlinear manifold

Differential Geometry
• Embedded manifolds
• Tangent space
• Geodesic
• Laplace-Beltrami operator

Embedded manifolds
• $\mathcal{M}^k \subset \mathbb{R}^N$
• Locally (not globally) looks like Euclidean space, e.g. the sphere $S^2 \subset \mathbb{R}^3$.

Example: circle
• $x^2 + y^2 = 1$
• Charts: continuous, invertible maps
  $\phi_{top}: (x, y) \mapsto x$
  $\phi_{top}^{-1}: x \mapsto (x, \sqrt{1 - x^2})$
• Atlas: a collection of charts covering the whole circle
• Transition map: $T(a) = \phi_{right}(\phi_{top}^{-1}(a)) = \phi_{right}(a, \sqrt{1 - a^2}) = \sqrt{1 - a^2}$
http://en.wikipedia.org/wiki/Manifold

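A quick numerical check of the two charts and their transition map, as a sketch in plain Python (the chart names follow the slide):

```python
import numpy as np

# phi_top projects the upper semicircle onto the x-axis;
# phi_right projects the right semicircle onto the y-axis.
phi_top = lambda p: p[0]
phi_top_inv = lambda a: (a, np.sqrt(1.0 - a**2))
phi_right = lambda p: p[1]

# A point in the overlap of the two charts (upper-right quarter circle).
theta = np.pi / 3
p = (np.cos(theta), np.sin(theta))

# Transition map T = phi_right o phi_top^{-1}: top-chart coordinates
# converted into right-chart coordinates.
a = phi_top(p)
print(np.isclose(phi_right(phi_top_inv(a)), np.sqrt(1.0 - a**2)))  # True
```
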
Tangent space
• The tangent space at $p$ is a $k$-dimensional affine subspace of $\mathbb{R}^N$: $T_p\mathcal{M}^k \subset \mathbb{R}^N$.

Tangent vectors and curves
• Tangent vectors ⟷ curves: a curve $\phi(t): \mathbb{R} \to \mathcal{M}^k$ through $p$ defines a tangent vector $v = \left.\frac{d\phi(t)}{dt}\right|_{0}$.

Tangent vectors as derivatives
• Tangent vectors ⟷ directional derivatives: for $\phi(t): \mathbb{R} \to \mathcal{M}^k$ and $f: \mathcal{M}^k \to \mathbb{R}$, the composition $f(\phi(t)): \mathbb{R} \to \mathbb{R}$ gives $\frac{df}{dv} = \left.\frac{d f(\phi(t))}{dt}\right|_{0}$.

Riemannian geometry
• Norms and angles in the tangent space: an inner product $\langle v, w \rangle$ on $T_p\mathcal{M}^k$ defines $\|v\|$, $\|w\|$, and the angle between $v$ and $w$.

Length of curves and geodesics
• Length can be measured using the norm in the tangent space: for $\phi(t): [0, 1] \to \mathcal{M}^k$,
  $l(\phi) = \int_0^1 \left\| \frac{d\phi}{dt} \right\| dt$
• Geodesic: the shortest curve between two points.

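The sphere makes the Euclidean-vs-geodesic distinction concrete, since its geodesics (great-circle arcs) have a closed form. A sketch:

```python
import numpy as np

def sphere_geodesic(p, q):
    """Geodesic (great-circle) distance between unit vectors p, q on S^2.

    The shortest curve between two points on the unit sphere is a
    great-circle arc whose length equals the angle between the vectors.
    """
    cos_angle = np.clip(np.dot(p, q), -1.0, 1.0)  # clip guards rounding error
    return np.arccos(cos_angle)

p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0])
print(sphere_geodesic(p, q))   # pi/2, the arc length along the sphere
print(np.linalg.norm(p - q))   # sqrt(2), the shorter straight-line chord
```
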
Gradient
• Tangent vectors ⟷ directional derivatives.
• The gradient of $f: \mathcal{M}^k \to \mathbb{R}$ points in the direction of maximum change and satisfies $\langle \nabla f, v \rangle \equiv \frac{df}{dv}$.

Exponential map
• Let $\phi(t)$ be the geodesic with $\phi(0) = p$ and initial velocity $\left.\frac{d\phi(t)}{dt}\right|_{0} = v$.
• The exponential map $\exp_p: T_p\mathcal{M}^k \to \mathcal{M}^k$ sends $v$ to the point this geodesic reaches at $t = 1$; in the figure, $\exp_p(v) = r$ and $\exp_p(w) = q$.

Laplace-Beltrami operator
• Use $\exp_p: T_p\mathcal{M}^k \to \mathcal{M}^k$ to carry an orthonormal coordinate system $(x_1, \dots, x_k)$ from $T_p\mathcal{M}^k$ onto the manifold.
• In these coordinates, the Laplace-Beltrami operator applied to $f: \mathcal{M}^k \to \mathbb{R}$ at $p$ reduces to the ordinary Laplacian $\Delta f = \sum_i \frac{\partial^2 f}{\partial x_i^2}$.

Linear Manifold Learning
• Principal Components Analysis
• Multidimensional Scaling

Principal Components Analysis
• Given $x_1, x_2, \dots, x_n \in \mathbb{R}^D$ with mean 0.
• Find $y_1, y_2, \dots, y_n \in \mathbb{R}$ such that $y_i = w \cdot x_i$ and
  $\arg\max_{w} \mathrm{var}(\{y_i\}) = \sum_i y_i^2 = w^T \left( \sum_i x_i x_i^T \right) w$
• $w^*$ is the leading eigenvector of $\sum_i x_i x_i^T$.

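A minimal NumPy sketch of PCA as stated on the slide, via eigendecomposition of the scatter matrix $\sum_i x_i x_i^T$ (the function name and test data are illustrative):

```python
import numpy as np

def pca(X, k):
    """Project rows of X (n samples, D features) onto the top-k principal axes."""
    X = X - X.mean(axis=0)                # center the data (slide assumes mean 0)
    C = X.T @ X                           # scatter matrix sum_i x_i x_i^T
    eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
    W = eigvecs[:, ::-1][:, :k]           # leading k eigenvectors
    return X @ W                          # y_i = W^T x_i for each sample

X = np.random.randn(200, 5) @ np.diag([3.0, 2.0, 1.0, 0.1, 0.1])
Y = pca(X, 2)
print(Y.shape)  # (200, 2)
```
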
Multidimensional Scaling
• MDS explores similarities or dissimilarities in data.
• Given $N$ data points with a distance function $\delta_{i,j}$, the dissimilarity matrix is
  $\Delta := \begin{pmatrix} \delta_{1,1} & \cdots & \delta_{1,N} \\ \vdots & \ddots & \vdots \\ \delta_{N,1} & \cdots & \delta_{N,N} \end{pmatrix}$
• Find $x_1, x_2, \dots, x_N \in \mathbb{R}^D$ minimizing
  $\min_{x_1, \dots, x_N} \sum_{i<j} \left( \|x_i - x_j\| - \delta_{i,j} \right)^2$

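The slide states the metric stress objective; classical MDS solves a closely related problem in closed form by double-centering the squared dissimilarities. A sketch:

```python
import numpy as np

def classical_mds(Delta, k):
    """Embed an N x N dissimilarity matrix Delta into R^k (classical MDS)."""
    N = Delta.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N   # centering matrix
    B = -0.5 * J @ (Delta ** 2) @ J       # recovered inner-product (Gram) matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:k]   # k largest eigenvalues
    scale = np.sqrt(np.maximum(eigvals[idx], 0))  # guard tiny negatives
    return eigvecs[:, idx] * scale

# Euclidean distances of points in R^3 are recovered up to rotation/translation.
X = np.random.randn(10, 3)
Delta = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
Y = classical_mds(Delta, 3)
```
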
Nonlinear Manifold Learning
• Isomap (Tenenbaum et al., 2000)
• LLE (Roweis & Saul, 2000)
• Laplacian Eigenmaps (Belkin & Niyogi, 2001)

Algorithmic framework
• A neighborhood graph is the starting point common to all of these methods.

Isomap: Motivation
• PCA/MDS see only the Euclidean structure.
• Only geodesic distances reflect the true low-dimensional geometry of the manifold.
• The question: how can we approximate geodesic distances?

Isomap
1. Construct the neighborhood graph with edge lengths $d_X(i, j)$ given by Euclidean distance
   – $\epsilon$-Isomap: neighbors within a radius $\epsilon$
   – $K$-Isomap: $K$ nearest neighbors
2. Compute shortest paths as approximations of geodesic distances
   a. Initialize $d_G(i, j) = d_X(i, j)$ for neighbors ($\infty$ otherwise)
   b. For $k = 1, 2, \dots, N$, replace every $d_G(i, j)$ by $\min\{d_G(i, j),\ d_G(i, k) + d_G(k, j)\}$ (Floyd-Warshall)
3. Construct the $d$-dimensional embedding by applying MDS to $d_G$.

A sketch of these three steps follows.

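In this sketch, SciPy's shortest-path routine replaces the slide's explicit Floyd-Warshall loop but computes the same distances; the neighborhood graph is assumed connected:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph

def isomap(X, n_neighbors=8, d=2):
    # Step 1: K-nearest-neighbor graph weighted by Euclidean distance.
    G = kneighbors_graph(X, n_neighbors, mode='distance')
    # Step 2: all-pairs shortest paths approximate geodesic distances.
    D_G = shortest_path(G, directed=False)
    # Step 3: classical MDS on the geodesic distance matrix.
    N = D_G.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N
    B = -0.5 * J @ (D_G ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:d]
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0))
```
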
Isomap: results
Face images varying in pose and illumination; hand images varying in finger extension and wrist rotation. (Result figures omitted.)

Isomap: estimating the intrinsic dimensionality
(Figure omitted.)

Locally Linear Embedding
• Intuition: each data point and its neighbors are expected to lie on or close to a locally linear patch of the manifold.

Locally Linear Embedding
1. Assign neighbors to each data point ($k$-NN).
2. Reconstruct each point by a weighted linear combination of its neighbors.
3. Map each point to embedded coordinates.
S. T. Roweis and L. K. Saul. Science 2000; 290:2323-2326.

Steps of locally linear embedding
• Suppose we have $N$ data points $X_i$ in a $D$-dimensional space.
• Step 1: Construct the neighborhood graph
  – $k$-NN neighborhood
  – Euclidean distance or normalized dot products

Steps of locally linear embedding
• Step 2: Compute the weights $W_{ij}$ that best linearly reconstruct $X_i$ from its neighbors by minimizing
  $\varepsilon(W) = \sum_i \left\| X_i - \sum_j W_{ij} X_j \right\|^2$
  where $\sum_j W_{ij} = 1$ and $W_{ij} = 0$ if $X_j$ is not a neighbor of $X_i$.

Steps of locally linear embedding
• Step 3: Compute the low-dimensional embedding best reconstructed by $W_{ij}$ by minimizing
  $\Phi(Y) = \sum_i \left\| Y_i - \sum_j W_{ij} Y_j \right\|^2$
• Note: $W$ is a sparse matrix, and its $i$-th row gives the barycentric coordinates (center of mass) of $X_i$ in the basis of its nearest neighbors.
• Similar to PCA, the embedding uses the lowest eigenvectors of $(I - W)^T (I - W)$.

A sketch of all three steps follows.

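A NumPy sketch of the three LLE steps (the regularization term, needed when the local Gram matrix is singular, follows common practice and is an assumption, not part of the slide):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lle(X, n_neighbors=10, d=2, reg=1e-3):
    N = X.shape[0]
    # Step 1: k nearest neighbors (column 0 is the point itself; drop it).
    nbrs = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    idx = nbrs.kneighbors(X, return_distance=False)[:, 1:]
    # Step 2: weights minimizing ||X_i - sum_j W_ij X_j||^2 with sum_j W_ij = 1.
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[idx[i]] - X[i]                          # neighbors centered at X_i
        G = Z @ Z.T                                   # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)  # regularize if singular
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, idx[i]] = w / w.sum()                    # enforce sum_j W_ij = 1
    # Step 3: bottom eigenvectors of (I - W)^T (I - W), skipping the
    # constant eigenvector with eigenvalue ~0.
    M = (np.eye(N) - W).T @ (np.eye(N) - W)
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, 1:d + 1]
```
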
LLE (Comments by Ruimao Zhang)
Strengths
• LLE can learn locally linear low-dimensional manifolds of arbitrary dimension.
• LLE has few free parameters: only K and d.
• The neighbor reconstruction weights of each point are invariant under translation, rotation, and scaling.
• LLE has an analytic global optimum and requires no iteration.
• LLE reduces to a sparse-matrix eigenvalue computation, so its computational complexity is relatively low and it is easy to implement.

Weaknesses
• LLE requires the learned manifold to be non-closed and locally linear.
• LLE requires the samples to be densely sampled on the manifold.
• The parameters K and d admit too many possible choices.
• LLE is sensitive to noise in the samples.

Laplacian Eigenmaps
• Uses the notion of the Laplacian of a graph to compute a low-dimensional representation of the data.
  – The Laplacian of a graph is analogous to the Laplace-Beltrami operator on manifolds, whose eigenfunctions have properties desirable for embedding (see M. Belkin and P. Niyogi for the justification).

Laplacian matrix (discrete Laplacian)
• The Laplacian matrix is a matrix representation of a graph:
  $L = D - A$
  – $L$: Laplacian matrix
  – $D$: degree matrix
  – $A$: adjacency matrix

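A concrete example of $L = D - A$ on a four-node path graph:

```python
import numpy as np

# Adjacency matrix of the path graph 0 - 1 - 2 - 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

D = np.diag(A.sum(axis=1))  # degree matrix: row sums on the diagonal
L = D - A                   # discrete Laplacian

print(L)
print(L @ np.ones(4))       # all zeros: the constant vector is in the null space
```
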
Laplacian Eigenmaps
(Algorithm figures from the MLSS slides mlss09us_niyogi_belkin_gmml omitted.)
• $D$: degree matrix
• $W$: (weighted) adjacency matrix
• $L$: Laplacian matrix
• The embedding is obtained from the generalized eigenvalue problem $D^{-1} L f = \lambda f$, i.e. $L f = \lambda D f$.

A minimal sketch of this step follows.

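This sketch assumes a dense affinity matrix and a connected graph with positive degrees, so that $D$ is positive definite:

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(W, d=2):
    """Embed a graph with (weighted) adjacency W by solving L f = lambda D f."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    eigvals, eigvecs = eigh(L, D)  # generalized problem, ascending eigenvalues
    # Skip the trivial constant eigenvector (eigenvalue 0); keep the next d.
    return eigvecs[:, 1:d + 1]
```
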
Justification of optimal embedding
• We have constructed a weighted graph $G = (V, E)$.
• We want to map $G$ to a line $\boldsymbol{y} = (y_1, y_2, \dots, y_n)^T$ so that connected points stay as close together as possible.
• This can be done by minimizing the objective function $\sum_{ij} (y_i - y_j)^2 W_{ij}$, which incurs a heavy penalty if neighboring points are mapped far apart.

Justification of optimal embedding (Cont.)
$\sum_{ij} (y_i - y_j)^2 W_{ij} = \sum_{ij} (y_i^2 + y_j^2 - 2 y_i y_j) W_{ij}$
$= \left( \sum_i y_i^2 D_{ii} + \sum_j y_j^2 D_{jj} \right) - 2 \sum_{i,j} y_i y_j W_{ij}$
$= 2 \boldsymbol{y}^T D \boldsymbol{y} - 2 \boldsymbol{y}^T W \boldsymbol{y} = 2 \boldsymbol{y}^T (D - W) \boldsymbol{y} = 2 \boldsymbol{y}^T L \boldsymbol{y}$
Hence $\frac{1}{2} \sum_{ij} (y_i - y_j)^2 W_{ij} = \boldsymbol{y}^T L \boldsymbol{y}$.

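The identity is easy to verify numerically on a random symmetric affinity matrix:

```python
import numpy as np

# Check sum_ij (y_i - y_j)^2 W_ij = 2 y^T L y on a random weighted graph.
rng = np.random.default_rng(0)
W = rng.random((6, 6)); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
D = np.diag(W.sum(axis=1))
L = D - W
y = rng.standard_normal(6)

lhs = sum((y[i] - y[j]) ** 2 * W[i, j] for i in range(6) for j in range(6))
rhs = 2 * y @ L @ y
print(np.isclose(lhs, rhs))  # True
```
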
Justification of optimal embedding (Cont.)
The minimization problem reduces to finding
$\arg\min_{\boldsymbol{y}:\ \boldsymbol{y}^T D \boldsymbol{y} = 1} \boldsymbol{y}^T L \boldsymbol{y}$
Note the constraint removes an arbitrary scaling factor in the embedding. Using a Lagrange multiplier and setting the derivative with respect to $\boldsymbol{y}$ to zero, we obtain
$L \boldsymbol{y} = \lambda D \boldsymbol{y}$
The optimum is given by the minimum-eigenvalue solution to this generalized eigenvalue problem, after discarding the trivial solution $\boldsymbol{y} = \boldsymbol{1}$, $\lambda = 0$.

More methods for nonlinear manifold learning
(Overview figure omitted.)

Applications
• Super-resolution
• Laplacianfaces

Super-Resolution Through Neighbor Embedding
• Intuition: small patches in the low- and high-resolution images form manifolds with similar local geometry in two distinct spaces.
• $X$: low-resolution image; $Y$: target high-resolution image.
• The algorithm is closely analogous to LLE (see the sketch after this list):
  – Step 1: construct the neighborhood of each patch in $X$.
  – Step 2: compute the reconstruction weights of the neighbors that minimize the reconstruction error.
  – Step 3: perform the high-dimensional embedding with those weights (as opposed to the low-dimensional embedding of LLE).
  – Step 4: construct the target high-resolution image $Y$ by enforcing local compatibility and smoothness constraints between the adjacent patches obtained in Step 3.
Chang, Hong, Dit-Yan Yeung, and Yimin Xiong. "Super-resolution through neighbor embedding." CVPR 2004.

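A sketch of Steps 1-3 for a single patch, reusing the LLE-style weight computation (the function name, the pairing of training patches, and the regularization are illustrative assumptions, not the paper's exact implementation):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def neighbor_embedding_patch(x_lr, train_lr, train_hr, K=5, reg=1e-4):
    """Estimate one high-resolution patch from a low-resolution patch.

    train_lr / train_hr hold paired training patches (one per row). Weights
    are computed in low-resolution space and reused to combine the
    corresponding high-resolution patches.
    """
    nbrs = NearestNeighbors(n_neighbors=K).fit(train_lr)
    idx = nbrs.kneighbors(x_lr[None], return_distance=False)[0]
    Z = train_lr[idx] - x_lr                # neighbors centered at x_lr
    G = Z @ Z.T                             # local Gram matrix
    G += reg * np.trace(G) * np.eye(K)      # regularize against singularity
    w = np.linalg.solve(G, np.ones(K))
    w /= w.sum()                            # reconstruction weights sum to 1
    return w @ train_hr[idx]                # same weights, high-resolution space
```
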
Super-Resolution Through Neighbor Embedding
• Training parameters:
  – the number of nearest neighbors $K$
  – the patch size
  – the degree of overlap

Super-Resolution Through Neighbor Embedding: results
(Result images omitted.)

Laplacianfaces
• Face images are mapped from the image space to a low-dimensional face subspace (manifold), called Laplacianfaces, via Locality Preserving Projections (LPP).
• LPP is analogous to Laplacian Eigenmaps except for the objective function:
  – Laplacian Eigenmaps: $\min \sum_{ij} (y_i - y_j)^2 W_{ij}$
  – LPP: $\min \sum_{ij} (\boldsymbol{w}^T x_i - \boldsymbol{w}^T x_j)^2 W_{ij}$, i.e. the map is restricted to be linear, $y_i = \boldsymbol{w}^T x_i$.
He, Xiaofei, et al. "Face recognition using Laplacianfaces." IEEE Transactions on Pattern Analysis and Machine Intelligence 27.3 (2005): 328-340.

Laplacianfaces
Learning Laplacianfaces for representation:
1. PCA projection (keeping 98 percent of the information in the sense of reconstruction error).
2. Construct the nearest-neighbor graph.
3. Choose the weights.
4. Optimize: solve the generalized eigenvalue problem $X L X^T \boldsymbol{w} = \lambda X D X^T \boldsymbol{w}$; the eigenvectors corresponding to the $k$ smallest eigenvalues form the Laplacianfaces.

A sketch of the optimization step follows.

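A sketch of Step 4 with samples as rows, so the slide's matrices $X L X^T$ and $X D X^T$ become $X^T L X$ and $X^T D X$; the preceding PCA step is assumed to have made $X^T D X$ nonsingular:

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, W, k=2):
    """Locality Preserving Projections: X is n x D (rows are PCA-projected
    samples), W is the n x n affinity matrix of the neighbor graph."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    A = X.T @ L @ X                # corresponds to X L X^T on the slide
    B = X.T @ D @ X                # corresponds to X D X^T on the slide
    eigvals, eigvecs = eigh(A, B)  # ascending generalized eigenvalues
    return eigvecs[:, :k]          # the k smallest: the Laplacianfaces basis
```
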
Laplacianfaces
Two-dimensional linear embedding of face images by Laplacianfaces. (Figure omitted.)

Reference and Resources
• 浅谈流形学习 (A Brief Discussion of Manifold Learning). Pluskid. 2010-05-29. http://blog.pluskid.org/?p=533&cpage=1
• Wikipedia: MDS, Manifold, Laplacian Matrix
• PCA: C. M. Bishop, Pattern Recognition and Machine Learning
• Eigenvalue decomposition and SVD: 机器学习中的数学(5)-强大的矩阵奇异值分解(SVD)及其应用 (Mathematics in Machine Learning (5): The Powerful Singular Value Decomposition (SVD) and Its Applications)
• Generalized eigenvalue problem: Wolfram MathWorld; tutorial
• Video lecture: Geometric Methods and Manifold Learning. Mikhail Belkin and Partha Niyogi (authors of Laplacian Eigenmaps)
• MANI Manifold Learning Matlab Demo: http://www.math.ucla.edu/~wittman/mani/index.html

Thank you.