# Manifold learning

Review of Manifold learning

1. **A Brief Introduction to Manifold Learning.** Wei Yang, platero.yang@gmail.com, 2016/8/11. Some slides are from *Geometric Methods and Manifold Learning in Machine Learning* (Mikhail Belkin and Partha Niyogi), Summer School (MLSS), Chicago 2009.
2. **What is a manifold?** https://en.wikipedia.org/wiki/Manifold
3. **Manifolds in visual perception.** Consider a simple example of image variability: the set $M$ of all facial images generated by varying the orientation of a face.
   - 1D manifold: a single degree of freedom, the angle of rotation.
   - The dimensionality of $M$ would increase if we allowed image scaling, illumination changes, …
   Reference: "The Manifold Ways of Perception", H. Sebastian Seung and Daniel D. Lee, Science, 22 December 2000.
4. **Why manifolds?** Euclidean distance in the high-dimensional input space may not accurately reflect the intrinsic similarity; geodesic distance along the manifold does. Linear manifold vs. nonlinear manifold.
5. **Differential geometry.** Embedded manifolds, tangent space, geodesics, the Laplace–Beltrami operator.
6. **Embedded manifolds.** $\mathcal{M}^k \subset \mathbb{R}^N$: locally (not globally) looks like Euclidean space, e.g. $S^2 \subset \mathbb{R}^3$.
7. **Example: circle.** The unit circle $x^2 + y^2 = 1$.
   - Charts are continuous, invertible maps, e.g. $\chi_{top}(x, y) = x$ with inverse $\chi_{top}^{-1}(x) = (x, \sqrt{1 - x^2})$.
   - Atlas: a collection of charts covering the whole circle.
   - Transition map between overlapping charts: $T(a) = \chi_{right}(\chi_{top}^{-1}(a)) = \sqrt{1 - a^2}$.
   http://en.wikipedia.org/wiki/Manifold
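The circle charts and transition map above can be checked numerically. This is a minimal sketch (the function names `chi_top`, `chi_right`, `T` are mine, mirroring the slide's notation):

```python
import numpy as np

# Chart for the top half of the unit circle x^2 + y^2 = 1:
# project a point (x, y) with y > 0 to its x-coordinate.
def chi_top(p):
    return p[0]

def chi_top_inv(a):
    return (a, np.sqrt(1.0 - a**2))

# Chart for the right half (x > 0): project to the y-coordinate.
def chi_right(p):
    return p[1]

# Transition map T = chi_right ∘ chi_top^{-1}, defined on the chart
# overlap (the top-right quarter): T(a) = sqrt(1 - a^2).
def T(a):
    return chi_right(chi_top_inv(a))

print(T(0.6))   # sqrt(1 - 0.36) = 0.8
```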
8. **Tangent space.** $T_p\mathcal{M}^k$: a $k$-dimensional affine subspace of $\mathbb{R}^N$ attached at the point $p$.
9. **Tangent vectors and curves.** Tangent vectors ↔ curves: a curve $\gamma(t): \mathbb{R} \to \mathcal{M}^k$ with $\gamma(0) = p$ defines the tangent vector $v = \frac{d\gamma(t)}{dt}\big|_0$ at $p$. (Geometric Methods and Manifold Learning.)
10. **Tangent vectors as derivatives.** Tangent vectors ↔ directional derivatives: for $f: \mathcal{M}^k \to \mathbb{R}$ and a curve $\gamma(t): \mathbb{R} \to \mathcal{M}^k$ with tangent vector $v$ at $p$, $\frac{df}{dv} = \frac{d f(\gamma(t))}{dt}\big|_0$.
11. **Riemannian geometry.** An inner product on each tangent space gives norms and angles: $\langle v, w \rangle$, $\|v\|$, $\|w\|$.
12. **Length of curves and geodesics.** The norm in the tangent space measures curve length: for $\gamma(t): [0, 1] \to \mathcal{M}^k$, $l(\gamma) = \int_0^1 \left\| \frac{d\gamma}{dt} \right\| dt$. A geodesic is the shortest curve between two points.
13. **Gradient.** Since tangent vectors ↔ directional derivatives, the gradient of $f: \mathcal{M}^k \to \mathbb{R}$ is the tangent vector defined by $\langle \nabla f, v \rangle \equiv \frac{df}{dv}$; it points in the direction of maximum change.
14. **Exponential map.** Let $\gamma(t)$ be the geodesic with $\gamma(0) = p$ and $\frac{d\gamma(t)}{dt}\big|_0 = v$. The exponential map $\exp_p: T_p\mathcal{M}^k \to \mathcal{M}^k$ sends each tangent vector $v$ to the point $\exp_p(v) = \gamma(1)$ reached by following that geodesic, e.g. $\exp_p(v) = q$.
15. **Laplace–Beltrami operator.** In the orthonormal coordinate system $(x_1, x_2, \ldots)$ given by $\exp_p: T_p\mathcal{M}^k \to \mathcal{M}^k$, the Laplace–Beltrami operator of $f: \mathcal{M}^k \to \mathbb{R}$ takes the familiar form $\Delta_{\mathcal{M}} f = \sum_i \frac{\partial^2 f}{\partial x_i^2}$ at $p$.
16. **Linear manifold learning.** Principal Components Analysis; Multidimensional Scaling.
17. **Principal Components Analysis.** Given $x_1, x_2, \ldots, x_n \in \mathbb{R}^D$ with mean 0, find $y_1, y_2, \ldots, y_n \in \mathbb{R}$ with $y_i = w \cdot x_i$ such that $\operatorname{argmax}_{w} \operatorname{var}(\{y_i\}) = \sum_i y_i^2 = w^\top \left( \sum_i x_i x_i^\top \right) w$. The optimum $w^*$ is the leading eigenvector of $\sum_i x_i x_i^\top$.
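The PCA statement above can be verified directly: the projection onto the leading eigenvector of $\sum_i x_i x_i^\top$ attains variance equal to the largest eigenvalue. A minimal NumPy sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 zero-mean points in R^3 that mostly vary along the direction (3, 1, 0.5).
X = rng.normal(size=(200, 1)) * np.array([[3.0, 1.0, 0.5]])
X += rng.normal(scale=0.05, size=(200, 3))
X -= X.mean(axis=0)

# The leading eigenvector of sum_i x_i x_i^T maximizes sum_i (w . x_i)^2.
C = X.T @ X
eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
w = eigvecs[:, -1]                     # leading (unit-norm) eigenvector
y = X @ w                              # 1-D projections y_i = w . x_i

print((y**2).sum(), eigvals[-1])       # the two values coincide
```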
18. **Multidimensional Scaling.** MDS explores similarities or dissimilarities in data. Given $n$ data points with pairwise distances $\delta_{i,j}$, the dissimilarity matrix is $\Delta = (\delta_{i,j})_{n \times n}$. Find $x_1, x_2, \ldots, x_n \in \mathbb{R}^D$ such that $\min_{x_1, \ldots, x_n} \sum_{i<j} (\|x_i - x_j\| - \delta_{i,j})^2$.
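When the $\delta_{i,j}$ are genuine Euclidean distances, the stress objective above is driven to zero by classical MDS, which recovers coordinates from the double-centered distance matrix. A sketch (the helper name `classical_mds` is mine):

```python
import numpy as np

def classical_mds(Delta, d):
    """Classical MDS: embed points in R^d from a pairwise-distance matrix."""
    n = Delta.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    B = -0.5 * J @ (Delta**2) @ J           # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:d]     # top-d eigenpairs
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))

# Distances computed from known 2-D points are recovered exactly
# (up to rotation, reflection, and translation).
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 2))
Delta = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
X = classical_mds(Delta, 2)
Delta_hat = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
print(np.allclose(Delta, Delta_hat))
```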
19. **Nonlinear manifold learning.** ISOMAP (Tenenbaum et al., 2000); LLE (Roweis & Saul, 2000); Laplacian Eigenmaps (Belkin & Niyogi, 2001).
20. **Algorithmic framework.** Constructing a neighborhood graph is common to all of these methods.
21. **Isomap: motivation.** PCA/MDS see only the Euclidean structure; only geodesic distances reflect the true low-dimensional geometry of the manifold. The question: how to approximate geodesic distances?
22. **Isomap.**
    1. Construct the neighborhood graph $G$ with edge weights $d_X(i, j)$ given by Euclidean distance: $\epsilon$-Isomap connects neighbors within a radius $\epsilon$; $K$-Isomap connects the $K$ nearest neighbors.
    2. Compute shortest paths as approximations of geodesic distances: initialize $d_G(i, j) = d_X(i, j)$ for connected pairs ($\infty$ otherwise); then for $k = 1, 2, \ldots, N$, replace all $d_G(i, j)$ by $\min\big(d_G(i, j),\ d_G(i, k) + d_G(k, j)\big)$ (Floyd–Warshall).
    3. Construct the $d$-dimensional embedding from $d_G$ using MDS.
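Steps 1 and 2 above can be sketched on points sampled from a circle, where the gap between Euclidean and geodesic distance is easy to see (step 3 would then feed `d_G` to MDS):

```python
import numpy as np

# Points on the unit circle; straight-line distance between antipodal
# points is 2, but the geodesic along the circle has length pi.
n, K = 40, 4
theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
X = np.c_[np.cos(theta), np.sin(theta)]

# Step 1: K-nearest-neighbor graph with Euclidean edge weights.
d_X = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
d_G = np.full((n, n), np.inf)
np.fill_diagonal(d_G, 0.0)
for i in range(n):
    for j in np.argsort(d_X[i])[1:K + 1]:
        d_G[i, j] = d_G[j, i] = d_X[i, j]

# Step 2: Floyd–Warshall, exactly the update on the slide:
# d_G(i,j) <- min(d_G(i,j), d_G(i,k) + d_G(k,j)) for each k.
for k in range(n):
    d_G = np.minimum(d_G, d_G[:, [k]] + d_G[[k], :])

i, j = 0, n // 2   # antipodal pair
print(d_X[i, j])   # Euclidean: 2.0
print(d_G[i, j])   # graph geodesic: close to pi
```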
23. **Isomap: results.** A face dataset varying in pose and illumination; a hand dataset varying in finger extension and wrist rotation.
24. **Isomap: estimating the intrinsic dimensionality.**
25. **Locally Linear Embedding.** Intuition: each data point and its neighbors are expected to lie on, or close to, a locally linear patch of the manifold.
26. **Locally Linear Embedding.**
    1. Assign neighbors to each data point ($K$-NN).
    2. Reconstruct each point by a weighted linear combination of its neighbors.
    3. Map each point to embedded coordinates.
    S. T. Roweis and L. K. Saul, Science 2000; 290:2323–2326.
27. **Steps of locally linear embedding.** Suppose we have $N$ data points $X_i$ in a $D$-dimensional space. Step 1: construct the neighborhood graph using $K$-NN, with Euclidean distance or normalized dot products.
28. **Steps of locally linear embedding.** Step 2: compute the weights $W_{ij}$ that best linearly reconstruct $X_i$ from its neighbors by minimizing $\varepsilon(W) = \sum_i \big\| X_i - \sum_j W_{ij} X_j \big\|^2$, where $\sum_j W_{ij} = 1$ and $W_{ij} = 0$ unless $X_j$ is a neighbor of $X_i$.
29. **Steps of locally linear embedding.** Step 3: compute the low-dimensional embedding best reconstructed by $W_{ij}$, minimizing $\Phi(Y) = \sum_i \big\| Y_i - \sum_j W_{ij} Y_j \big\|^2$. Note: $W$ is a sparse matrix whose $i$-th row gives the barycentric coordinates (center of mass) of $X_i$ in the basis of its nearest neighbors. Similarly to PCA, the embedding uses the lowest eigenvectors of $(I - W)^\top (I - W)$.
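Step 2 has a closed form per point: shift the neighbors so $X_i$ is at the origin, form the local Gram matrix, solve against the all-ones vector, and normalize so the weights sum to one. A sketch (the helper `lle_weights` and the regularization constant are my own; the regularizer is the standard fix for a singular Gram matrix when $K > D$):

```python
import numpy as np

def lle_weights(Xi, neighbors, reg=1e-6):
    """Barycentric weights reconstructing Xi from its neighbors (LLE step 2)."""
    Z = neighbors - Xi                         # shift neighbors to the origin
    C = Z @ Z.T                                # local Gram (covariance) matrix
    C = C + reg * np.trace(C) * np.eye(len(C)) # regularize: C can be singular
    w = np.linalg.solve(C, np.ones(len(C)))
    return w / w.sum()                         # enforce sum-to-one constraint

# A point inside the triangle of its 3 neighbors is reconstructed exactly;
# the weights are its barycentric coordinates.
neighbors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Xi = np.array([0.2, 0.3])
w = lle_weights(Xi, neighbors)
print(w)                # ≈ [0.5, 0.2, 0.3]
print(w @ neighbors)    # ≈ Xi
```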
30. **LLE (comments by Ruimao Zhang).**
    Strengths:
    - LLE can learn locally linear low-dimensional manifolds of arbitrary dimension.
    - Few parameters to set: $K$ and $d$.
    - The neighborhood weights of each point are invariant under translation, rotation, and scaling.
    - LLE has an analytical, globally optimal solution and requires no iteration.
    - LLE reduces to a sparse-matrix eigenvalue computation, so its computational complexity is relatively low and it is easy to implement.
    Weaknesses:
    - The manifold to be learned must be non-closed and locally linear.
    - The samples must be densely drawn from the manifold.
    - The parameters $K$ and $d$ admit too many possible choices.
    - LLE is sensitive to noise in the samples.
31. **Laplacian Eigenmaps.** Use the notion of the Laplacian of a graph to compute a low-dimensional representation of the data. The Laplacian of a graph is analogous to the Laplace–Beltrami operator on manifolds, whose eigenfunctions have properties desirable for embedding (see M. Belkin and P. Niyogi for justification).
32. **Laplacian matrix (discrete Laplacian).** The Laplacian matrix is a matrix representation of a graph: $L = D - A$, where $L$ is the Laplacian matrix, $D$ is the degree matrix, and $A$ is the adjacency matrix.
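The definition $L = D - A$ is easy to sketch on a small graph; two basic properties follow immediately: every row of $L$ sums to zero, and the constant vector is an eigenvector with eigenvalue 0.

```python
import numpy as np

# A small undirected graph: the path 0 - 1 - 2 - 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # graph Laplacian

print(L.sum(axis=1))         # rows sum to zero: L @ 1 = 0
eigvals = np.linalg.eigvalsh(L)
print(eigvals[0])            # smallest eigenvalue is 0
```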
33. **Laplacian Eigenmaps.** (Slides from mlss09us_niyogi_belkin_gmml.)
34. **Laplacian Eigenmaps.** (Slides from mlss09us_niyogi_belkin_gmml.)
35. **Laplacian Eigenmaps.** Notation: $D$ is the degree matrix, $W$ the (weighted) adjacency matrix, and $L$ the Laplacian matrix; the embedding solves the generalized eigenvalue problem $D^{-1} L f = \lambda f$. (Slides from mlss09us_niyogi_belkin_gmml.)
36. **Justification of the optimal embedding.** We have constructed a weighted graph $G = (V, E)$. We want to map $G$ to a line $y = (y_1, y_2, \ldots, y_n)^\top$ so that connected points stay as close together as possible. This can be done by minimizing the objective function $\sum_{ij} (y_i - y_j)^2 W_{ij}$, which incurs a heavy penalty if neighboring points are mapped far apart.
37. **Justification of the optimal embedding (cont.).**
    $$\sum_{ij} (y_i - y_j)^2 W_{ij} = \sum_{ij} (y_i^2 + y_j^2 - 2 y_i y_j) W_{ij} = \Big( \sum_i y_i^2 D_{ii} + \sum_j y_j^2 D_{jj} \Big) - 2 \sum_{i,j} y_i y_j W_{ij} = 2 y^\top D y - 2 y^\top W y = 2 y^\top (D - W) y = 2 y^\top L y,$$
    so $\frac{1}{2} \sum_{ij} (y_i - y_j)^2 W_{ij} = y^\top L y$.
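The identity derived above, $\frac{1}{2}\sum_{ij}(y_i - y_j)^2 W_{ij} = y^\top L y$, can be confirmed numerically for a random symmetric weight matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
# Random symmetric weight matrix with zero diagonal.
W = rng.random((n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
D = np.diag(W.sum(axis=1))
L = D - W

y = rng.normal(size=n)
lhs = 0.5 * sum(W[i, j] * (y[i] - y[j])**2
                for i in range(n) for j in range(n))
rhs = y @ L @ y
print(np.isclose(lhs, rhs))   # the two sides agree
```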
38. **Justification of the optimal embedding (cont.).** The minimization problem reduces to finding $\operatorname{argmin}_{y^\top D y = 1} y^\top L y$. Note that the constraint removes an arbitrary scaling factor in the embedding. Using a Lagrange multiplier and setting the derivative with respect to $y$ equal to zero, we obtain $L y = \lambda D y$. The optimum is given by the minimum-eigenvalue solution to this generalized eigenvalue problem (excluding the trivial solution $y = \mathbf{1}$, $\lambda = 0$).
39. **More methods for nonlinear manifold learning.**
40. **Applications.** Super-resolution; Laplacianfaces.
41. **Super-Resolution Through Neighbor Embedding.** Intuition: small patches in the low- and high-resolution images form manifolds with similar local geometry in two distinct spaces. $X$: low-resolution image; $Y$: target high-resolution image. The algorithm is closely analogous to LLE:
    1. Construct the neighborhood of each patch in $X$.
    2. Compute the reconstruction weights of the neighbors that minimize the reconstruction error.
    3. Perform the high-dimensional embedding (as opposed to the low-dimensional embedding of LLE).
    4. Construct the target high-resolution image $Y$ by enforcing local compatibility and smoothness constraints between adjacent patches obtained in step 3.
    Chang, Hong, Dit-Yan Yeung, and Yimin Xiong. "Super-resolution through neighbor embedding." CVPR 2004.
42. **Super-Resolution Through Neighbor Embedding.** Training parameters: the number of nearest neighbors $K$; the patch size; the degree of overlap.
43. **Super-Resolution Through Neighbor Embedding.** (Result figures.)
44. **Laplacianfaces.** Face images are mapped from the image space via Locality Preserving Projections (LPP) to a low-dimensional face subspace (manifold); the resulting basis images are called Laplacianfaces. LPP is analogous to Laplacian Eigenmaps except for the objective function: Laplacian Eigenmaps minimizes $\sum_{ij} (y_i - y_j)^2 W_{ij}$ over the embedding $y$ directly, while LPP restricts the map to a linear projection $y_i = a^\top x_i$ and minimizes the same objective over $a$. He, Xiaofei, et al. "Face recognition using Laplacianfaces." IEEE Transactions on Pattern Analysis and Machine Intelligence 27.3 (2005): 328-340.
45. **Laplacianfaces.** Learning Laplacianfaces for representation:
    1. PCA projection (keeping 98 percent of the information in the sense of reconstruction error).
    2. Construct the nearest-neighbor graph.
    3. Choose the weights.
    4. Optimize: solve the generalized eigenvalue problem $X L X^\top a = \lambda X D X^\top a$. The eigenvectors $a$ belonging to the $k$ lowest eigenvalues are chosen to form the projection; these basis vectors are the so-called Laplacianfaces.
46. **Laplacianfaces.** Two-dimensional linear embedding of face images by Laplacianfaces.
47. **References and resources.**
    - "A Brief Discussion of Manifold Learning" (浅谈流形学习). Pluskid, 2010-05-29. http://blog.pluskid.org/?p=533&cpage=1
    - Wikipedia: MDS, Manifold, Laplacian Matrix.
    - PCA: C. M. Bishop, PRML.
    - Eigenvalue decomposition and SVD: "Mathematics in Machine Learning (5): The Powerful Singular Value Decomposition (SVD) and Its Applications".
    - Generalized eigenvalue problem: Wolfram, tutorial.
    - Video lecture: Geometric Methods and Manifold Learning. Mikhail Belkin, Partha Niyogi (authors of the Laplacian Eigenmaps paper).
    - MANI fold Learning Matlab Demo: http://www.math.ucla.edu/~wittman/mani/index.html
48. **Thank you.**