Riemannian stochastic variance reduced gradient
on Grassmann manifold
Hiroyuki Kasai†, Hiroyuki Sato§, and Bamdev Mishra††
†The University of Electro-Communications, Japan
§Tokyo University of Science, Japan
††Amazon Development Centre India, India
August 10, 2016
Riemannian stochastic variance reduced gradient on Grassmann manifold (all copyrights owned by Kasai, Sato, and Mishra) 1
Summary (our contributions)
Address the stochastic gradient descent (SGD) algorithm for the
empirical risk minimization problem
$$\min_{w \in \mathbb{R}^d} \sum_{i=1}^{n} f_i(w).$$
Particularly, structured problems on manifolds, i.e., w ∈ M.
Propose Riemannian SVRG (R-SVRG).
Extend SVRG in the Euclidean into Riemannian manifolds.
Give two analyses:
Global convergence analysis, and
Local convergence rate analysis.
Show effectiveness of R-SVRG from numerical comparisons.
Stochastic gradient descent (SGD) (1)
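Update in SGD
$$w_k = w_{k-1} - \alpha_k \nabla f_{i_k}(w_{k-1}),$$
where $w_{k-1}$ is the current point, $i_k$ is a random sample, and $\nabla f_{i_k}(w_{k-1})$ is the single gradient for the $i_k$-th sample (= stochastic gradient).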
The stochastic gradient is an unbiased estimate of the full gradient:
$$E[\nabla f_i(w)] = \frac{1}{n} \sum_{i=1}^{n} \nabla f_i(w) = \nabla f(w).$$
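As a rough illustration of the update and the unbiasedness property above, the following sketch uses a synthetic least-squares problem. The problem, the names (`grad_i`, `full_grad`, `sgd`) and the step-size schedule `alpha0 / (1 + decay * k)` are assumptions made for this example only; they do not come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d)  # noiseless targets (assumption for the demo)

def grad_i(w, i):
    """Gradient of the i-th term f_i(w) = 0.5 * (a_i^T w - b_i)^2."""
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    """Gradient of f(w) = (1/n) * sum_i f_i(w)."""
    return A.T @ (A @ w - b) / n

def loss(w):
    return 0.5 * np.mean((A @ w - b) ** 2)

# Unbiasedness: averaging grad_i over all i (uniform sampling) gives the full gradient.
w = rng.standard_normal(d)
mean_of_stochastic_grads = np.mean([grad_i(w, i) for i in range(n)], axis=0)

def sgd(w0, steps=5000, alpha0=0.05, decay=0.01):
    """Plain SGD with a decaying step-size (hypothetical schedule)."""
    w = w0.copy()
    for k in range(steps):
        i = rng.integers(n)                 # random sample i_k
        alpha_k = alpha0 / (1.0 + decay * k)
        w -= alpha_k * grad_i(w, i)         # w_k = w_{k-1} - alpha_k * grad f_{i_k}(w_{k-1})
    return w

w_sgd = sgd(np.zeros(d))
```

Each SGD step touches a single row of `A`, whereas `full_grad` touches all n rows, which is the scalability argument made on the next slide.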
Stochastic gradient descent (SGD) (2)
Comparison with full gradient descent (FGD)
Pros: High scalability to large-scale data
Per-iteration complexity of SGD is independent of n.
Per-iteration complexity of FGD is linear in n.
Cons: Slow convergence
Decaying step-sizes are required for convergence, in order to avoid
big fluctuations around a solution due to a too large step-size, and
too slow convergence due to a too small step-size.
⇓
Sub-linear convergence rate $E[f(w_k)] - f(w^*) \in O(k^{-1})$.
FGD shows $f(w_k) - f(w^*) \in O(c^k)$.
Speeding up of SGD: Variance reduction technique
Accelerate the convergence rate of SGD
[Mairal, 2015, Roux et al., 2012,
Shalev-Shwartz and Zhang, 2012,
Shalev-Shwartz and Zhang, 2013, Defazio et al., 2014,
Zhang and Xiao, 2014].
Stochastic variance reduced gradient (SVRG)
[Johnson and Zhang, 2013]
linear convergence rate for strongly-convex functions.
Various variants
[Garber and Hazan, 2015] analyze the convergence rate for
SVRG when f is a convex function that is a sum of
non-convex (but smooth) terms.
[Shalev-Shwartz, 2015] proposes similar results.
[Allen-Zhu and Yan, 2015] further study the same case with
better convergence rates.
[Shamir, 2015] studies specifically the convergence properties
of the variance reduction PCA algorithm.
Very recently, [Allen-Zhu and Hazan, 2016] propose a variance
reduction method for faster non-convex optimization.
Stochastic variance reduced gradient (SVRG) (1)
[Johnson and Zhang, 2013]
Motivations:
Reduce the variance of stochastic gradients.
Unlike SAG, no need to store all gradients.
In exchange, allow additional gradient calculations.
Basic idea: hybrid algorithm of SGD and FGD.
Periodically, calculate and store a full gradient.
At every iteration, adjust a stochastic gradient v by the latest full
gradient to reduce variance.
⇓
Linear convergence rate
$$E[f(\tilde{w}_s)] - E[f(\tilde{w}^*)] \le \alpha^s \left(E[f(\tilde{w}_0)] - E[f(\tilde{w}^*)]\right)$$
Stochastic variance reduced gradient (SVRG) (2)
Simplified algorithm of SVRG
1: Initial iterate $w_0^0 \in \mathcal{M}$.
2: for s = 1, 2, . . . (outer loop) do
3:   Store $\tilde{w} = w_t^{s-1}$.
4:   Store $\nabla f(\tilde{w})$.
5:   for t = 1, 2, . . . , $m_s$ (inner loop) do
6:     Calculate the modified stochastic gradient
       $v_t^s = \nabla f_{i_t^s}(w_{t-1}^s) - \nabla f_{i_t^s}(\tilde{w}) + \nabla f(\tilde{w})$,
       i.e., the single gradient at $w_{t-1}^s$, minus the single gradient at $\tilde{w}$, plus the full gradient at $\tilde{w}$.
7:     Update $w_t^s = w_{t-1}^s - \alpha v_t^s$.
8:   end for
9: end for
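The simplified algorithm above can be sketched in the Euclidean case as follows. This is a minimal sketch under stated assumptions: the least-squares test problem, the function name `svrg`, the fixed inner-loop length, and the values of `alpha`, `outer_iters` and `inner_iters` are all chosen for illustration and are not taken from the slides.

```python
import numpy as np

def svrg(grad_i, full_grad, w0, n, alpha, outer_iters, inner_iters, rng):
    """Euclidean SVRG with a fixed step-size alpha (illustrative sketch)."""
    w = w0.copy()
    for s in range(outer_iters):
        w_tilde = w.copy()              # step 3: store the outer-loop point
        g_tilde = full_grad(w_tilde)    # step 4: store the full gradient
        for t in range(inner_iters):
            i = rng.integers(n)         # random sample index
            # step 6: single gradient at w, minus single gradient at w_tilde,
            # plus full gradient at w_tilde
            v = grad_i(w, i) - grad_i(w_tilde, i) + g_tilde
            w = w - alpha * v           # step 7
    return w

# Synthetic least-squares problem (assumption for the demo).
rng = np.random.default_rng(0)
n, d = 500, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def grad_i(w, i):
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    return A.T @ (A @ w - b) / n

w_svrg = svrg(grad_i, full_grad, np.zeros(d), n,
              alpha=0.005, outer_iters=30, inner_iters=2 * n, rng=rng)
w_star = np.linalg.lstsq(A, b, rcond=None)[0]  # reference minimizer
```

Only `w_tilde` and one full gradient are stored per outer iteration, which is the memory advantage over SAG noted on the previous slide; the price is two single-sample gradients per inner step.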
Stochastic variance reduced gradient (SVRG) (3)
[Johnson and Zhang, 2013]
Structured problems
Examples
PCA problem: calculate the projection matrix U that minimizes
$$\min_{U \in \mathrm{St}(r,d)} \frac{1}{n} \sum_{i=1}^{n} \| x_i - U U^T x_i \|_2^2,$$
where U belongs to the Stiefel manifold St(r, d):
the set of matrices of size d × r with orthonormal columns,
i.e., $U^T U = I$.
⇓
Cost function remains unchanged under the orthogonal group
action U → UO for O ∈ O(r).
⇓
U belongs to the Grassmann manifold Gr(r, d):
the set of r-dimensional linear subspaces in $\mathbb{R}^d$, each represented by
a d × r matrix with orthonormal columns, i.e., $U^T U = I$.
Other examples (not exhaustive)
matrix completion, subspace tracking, spectral clustering,
CCA, bi-factor regression, ...
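The invariance of the PCA cost under U → UO, which is what moves the problem from the Stiefel to the Grassmann manifold, can be checked numerically. The sketch below uses random synthetic data; the function name `pca_cost` and the problem sizes are assumptions for illustration.

```python
import numpy as np

def pca_cost(U, X):
    """(1/n) * sum_i ||x_i - U U^T x_i||_2^2, where the rows of X are the samples x_i."""
    residual = X - (X @ U) @ U.T
    return np.sum(residual ** 2) / X.shape[0]

rng = np.random.default_rng(0)
n, d, r = 1000, 20, 5
X = rng.standard_normal((n, d))

# A point on St(r, d): a d x r matrix with orthonormal columns.
U, _ = np.linalg.qr(rng.standard_normal((d, r)))
# A random r x r orthogonal matrix O in O(r).
O, _ = np.linalg.qr(rng.standard_normal((r, r)))

cost_U = pca_cost(U, X)
cost_UO = pca_cost(U @ O, X)  # same subspace, different basis
```

Since U and UO span the same column space, the projector $UU^T$ is unchanged and the two costs agree up to round-off.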
Optimization on Riemannian manifolds
[Absil et al., 2008]
If the constraints can be defined by a manifold, the constrained problem
$$\min_{w \in \mathbb{R}^n} f(w), \quad \text{s.t. } c_i(w) = 0, \; c_j(w) \le 0$$
⇓
is viewed as an unconstrained problem on the manifold
$$\min_{w \in \mathcal{M}} f(w), \quad \mathcal{M}: \text{Riemannian manifold}.$$
Riemannian SGD (R-SGD) (1)
[Bonnabel, 2013]
Extension of Euclidean SGD into Riemannian manifolds.
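Update in R-SGD
$$w_k = \mathrm{Exp}_{w_{k-1}}\left(-\alpha_k \, \mathrm{grad} f_{i_k}(w_{k-1})\right),$$
where $\mathrm{grad} f_{i_k}(w_{k-1})$ is the Riemannian stochastic gradient and $\mathrm{Exp}_{w_{k-1}}(\cdot)$ moves along the geodesic (by exponential mapping).
1. Calculate a Riemannian stochastic gradient $\mathrm{grad} f_{i_k}(w_{k-1})$ for
the sample $i_k$ at $w_{k-1}$.
2. Then, move along the geodesic from $w_{k-1}$ in the direction of
$-\mathrm{grad} f_{i_k}(w_{k-1})$.
A geodesic is the generalization of a straight line in Euclidean space.
The exponential mapping $\mathrm{Exp}_w(\cdot)$ specifies the geodesic.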
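The two steps above can be sketched on the simplest possible example. To keep the code short, the manifold here is the unit sphere (the r = 1 case of PCA, where a point is a unit vector and geodesics are great circles) rather than the general Grassmann manifold; this substitution, the synthetic data, the names `sphere_exp`, `rgrad_i`, `r_sgd`, and the step-size schedule are all assumptions for illustration.

```python
import numpy as np

def sphere_exp(w, xi):
    """Exponential map on the unit sphere: follow the great circle from w along xi."""
    norm = np.linalg.norm(xi)
    if norm < 1e-16:
        return w.copy()
    return np.cos(norm) * w + np.sin(norm) * xi / norm

def rgrad_i(w, x):
    """Riemannian gradient of f_i(w) = -(x^T w)^2: Euclidean gradient projected onto the tangent space at w."""
    egrad = -2.0 * (x @ w) * x
    return egrad - (w @ egrad) * w

def r_sgd(X, w0, alpha0, decay, steps, rng):
    w = w0 / np.linalg.norm(w0)
    for k in range(steps):
        x = X[rng.integers(X.shape[0])]          # random sample i_k
        alpha_k = alpha0 / (1.0 + decay * k)     # decaying step-size (hypothetical schedule)
        w = sphere_exp(w, -alpha_k * rgrad_i(w, x))
        w /= np.linalg.norm(w)                   # guard against round-off drift
    return w

# Synthetic data with one dominant direction (assumption for the demo).
rng = np.random.default_rng(0)
n, d = 500, 10
scales = np.array([3.0] + [0.5] * (d - 1))
X = rng.standard_normal((n, d)) * scales

w_rsgd = r_sgd(X, rng.standard_normal(d), alpha0=0.01, decay=0.01, steps=5000, rng=rng)
v1 = np.linalg.eigh(X.T @ X / n)[1][:, -1]       # leading eigenvector, for reference
```

Minimizing $-(x^T w)^2$ over unit vectors is equivalent to the PCA cost with r = 1, so the iterate should align (up to sign) with the leading eigenvector `v1`.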
Riemannian SGD (R-SGD) (2)
[Bonnabel, 2013]
Proposal: Riemannian SVRG (R-SVRG)
[Kasai et al., 2016]
Propose a novel extension of SVRG in the Euclidean space to
the Riemannian manifold search space.
Extension is not trivial.
Focus on the Grassmann manifold Gr(r, d).
Can be generalized to other compact Riemannian manifolds.
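Notations
Quantity | SVRG | R-SVRG
Model parameter | $w_{t-1}^s \in \mathbb{R}^n$ | $U_{t-1}^s \in \mathrm{Gr}(r,d)$
Edge point of outer loop | $\tilde{w} \in \mathbb{R}^n$ | $\tilde{U} \in \mathrm{Gr}(r,d)$
Stochastic gradient | $\nabla f_{i_t^s}(w_{t-1}^s) \in \mathbb{R}^n$ | $\mathrm{grad} f_{i_t^s}(U_{t-1}^s) \in T_{U_{t-1}^s}\mathrm{Gr}(r,d)$
Modified stochastic gradient | $v_t^s \in \mathbb{R}^n$ | $\xi_t^s \in T_{U_{t-1}^s}\mathrm{Gr}(r,d)$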
Proposal: Riemannian SVRG (R-SVRG)
Algorithm
Straightforward modification of stochastic gradient
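Extend the SVRG case $v_t^s = \nabla f_{i_t^s}(w_{t-1}^s) - \nabla f_{i_t^s}(\tilde{w}) + \nabla f(\tilde{w})$ directly:
$$\xi_t^s = \mathrm{grad} f_{i_t^s}(U_{t-1}^s) - \mathrm{grad} f_{i_t^s}(\tilde{U}) + \mathrm{grad} f(\tilde{U})$$
Meaningless because manifolds are not vector spaces: the first term is a tangent vector at $U_{t-1}^s$, while the other two are tangent vectors at $\tilde{U}$, so they cannot be added directly.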
⇓
Proposed modification
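Transport the vectors at $\tilde{U}$ into the current tangent space at $U_{t-1}^s$
by parallel translation, then add them.
$$\xi_t^s = \mathrm{grad} f_{i_t^s}(U_{t-1}^s) + P_{\gamma}^{U_{t-1}^s \leftarrow \tilde{U}}\left(-\mathrm{grad} f_{i_t^s}(\tilde{U}) + \mathrm{grad} f(\tilde{U})\right),$$
where $P_{\gamma}^{U_{t-1}^s \leftarrow \tilde{U}}$ is the parallel-translation operator along the geodesic $\gamma$ from $\tilde{U}$ to $U_{t-1}^s$.
The logarithm mapping gives the tangent vector for the geodesic $\gamma$.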
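The proposed modification can be illustrated with a compact sketch. For brevity it again uses the unit sphere (PCA with r = 1) instead of the general Grassmann manifold, because parallel translation along the minimal geodesic of the sphere has a one-line closed form; this substitution, the function names, the synthetic data and the parameter values are assumptions for illustration and are not the authors' implementation.

```python
import numpy as np

def sphere_exp(w, xi):
    """Exponential map on the unit sphere."""
    norm = np.linalg.norm(xi)
    if norm < 1e-16:
        return w.copy()
    return np.cos(norm) * w + np.sin(norm) * xi / norm

def sphere_transport(w, y, zeta):
    """Parallel-translate the tangent vector zeta from w to y along the minimal geodesic."""
    return zeta - (y @ zeta) / (1.0 + w @ y) * (w + y)

def rgrad_i(w, x):
    """Riemannian gradient of f_i(w) = -(x^T w)^2 at w."""
    egrad = -2.0 * (x @ w) * x
    return egrad - (w @ egrad) * w

def rgrad_full(w, X):
    """Riemannian gradient of f(w) = (1/n) * sum_i f_i(w) at w."""
    egrad = -2.0 * X.T @ (X @ w) / X.shape[0]
    return egrad - (w @ egrad) * w

def r_svrg_sphere(X, w0, alpha, outer_iters, inner_iters, rng):
    n = X.shape[0]
    w = w0 / np.linalg.norm(w0)
    for s in range(outer_iters):
        w_tilde = w.copy()
        g_tilde = rgrad_full(w_tilde, X)
        for t in range(inner_iters):
            x = X[rng.integers(n)]
            # Both terms below are tangent vectors at w_tilde ...
            correction = -rgrad_i(w_tilde, x) + g_tilde
            # ... so they are transported to the tangent space at w before adding.
            xi = rgrad_i(w, x) + sphere_transport(w_tilde, w, correction)
            w = sphere_exp(w, -alpha * xi)
            w /= np.linalg.norm(w)  # guard against round-off drift
    return w

# Synthetic data with one dominant direction (assumption for the demo).
rng = np.random.default_rng(0)
n, d = 500, 10
scales = np.array([3.0] + [0.5] * (d - 1))
X = rng.standard_normal((n, d)) * scales

w_rsvrg = r_svrg_sphere(X, rng.standard_normal(d), alpha=0.002,
                        outer_iters=20, inner_iters=n, rng=rng)
v1 = np.linalg.eigh(X.T @ X / n)[1][:, -1]  # leading eigenvector, for reference
```

On the sphere the transport formula needs only the two end points, so no explicit logarithm mapping appears; on the Grassmann manifold the logarithm mapping supplies the geodesic direction that the parallel-translation operator requires.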
Proposal: Riemannian SVRG (R-SVRG)
Conceptual illustration
Tools in Grassmann manifold
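Exponential mapping in the direction of $\xi \in T_{U(0)}$
$$U(t) = \begin{bmatrix} U(0)V & W \end{bmatrix} \begin{bmatrix} \cos t\Sigma \\ \sin t\Sigma \end{bmatrix} V^T,$$
where $\xi = W \Sigma V^T$ is the rank-r singular value decomposition of $\xi$.
The cos(·) and sin(·) operations act only on the diagonal entries.
Parallel translation of $\zeta \in T_{U(0)}$ along $\gamma(t)$ with $\xi$
$$\zeta(t) = \left( \begin{bmatrix} U(0)V & W \end{bmatrix} \begin{bmatrix} -\sin t\Sigma \\ \cos t\Sigma \end{bmatrix} W^T + (I - WW^T) \right) \zeta.$$
Logarithm mapping of U(t) at U(0)
$$\xi = \log_{U(0)}(U(t)) = W \arctan(\Sigma) V^T,$$
where $W \Sigma V^T$ is the rank-r singular value decomposition of
$(U(t) - U(0)U(0)^T U(t))(U(0)^T U(t))^{-1}$.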
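The three formulas above translate almost line by line into NumPy. The following is a sketch, not the authors' Matlab code: the function names `grassmann_exp`, `grassmann_transport`, `grassmann_log` and the random test data are assumptions, and the tangent vectors are assumed to have full rank r so that the columns of W are orthogonal to U(0).

```python
import numpy as np

def grassmann_exp(U, xi, t=1.0):
    """U(t) = [U V, W] [cos(t Sigma); sin(t Sigma)] V^T with xi = W Sigma V^T."""
    W, sigma, Vt = np.linalg.svd(xi, full_matrices=False)
    return (U @ Vt.T) @ np.diag(np.cos(t * sigma)) @ Vt + W @ np.diag(np.sin(t * sigma)) @ Vt

def grassmann_transport(U, xi, zeta, t=1.0):
    """Parallel translation of zeta along the geodesic from U with direction xi."""
    W, sigma, Vt = np.linalg.svd(xi, full_matrices=False)
    WtZ = W.T @ zeta
    moved = (-(U @ Vt.T) @ np.diag(np.sin(t * sigma)) + W @ np.diag(np.cos(t * sigma))) @ WtZ
    return moved + zeta - W @ WtZ  # the (I - W W^T) zeta part is left unchanged

def grassmann_log(U, Y):
    """Tangent vector at U pointing to Y: W arctan(Sigma) V^T."""
    UtY = U.T @ Y
    M = (Y - U @ UtY) @ np.linalg.inv(UtY)
    W, sigma, Vt = np.linalg.svd(M, full_matrices=False)
    return W @ np.diag(np.arctan(sigma)) @ Vt

rng = np.random.default_rng(0)
d, r = 20, 5
U, _ = np.linalg.qr(rng.standard_normal((d, r)))

def random_tangent(U, rng):
    """Random tangent vector at U, i.e., a d x r matrix with U^T xi = 0."""
    Z = rng.standard_normal(U.shape)
    return Z - U @ (U.T @ Z)

xi = random_tangent(U, rng)
xi *= 0.5 / np.linalg.norm(xi)   # keep singular values well below pi/2
zeta = random_tangent(U, rng)

Y = grassmann_exp(U, xi)                       # end point of the geodesic at t = 1
xi_back = grassmann_log(U, Y)                  # should recover xi
zeta_moved = grassmann_transport(U, xi, zeta)  # tangent vector at Y
```

The round trip `grassmann_log(U, grassmann_exp(U, xi))` recovers `xi` only while the singular values of `xi` stay below π/2, which is why the example rescales `xi`.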
Main results: convergence analyses
Global convergence analysis with decaying step-sizes.
Guarantee that the iteration globally converges to a critical
point starting from any initialization point.
Local convergence rate analysis under fixed step-size.
Consider the rate in neighborhood of a local minimum.
Assume that Lipschitz smoothness and lower bound of Hessian
hold only in this neighborhood.
Obtain local linear convergence rate as
$$E[(\mathrm{dist}(\tilde{U}^s, U^*))^2] \le \frac{4(1 + 8m\alpha^2\beta^2)}{\alpha m(\sigma - 14\eta\beta^2)} \, E[(\mathrm{dist}(\tilde{U}^{s-1}, U^*))^2].$$
Proof sketch for local convergence rate
1. Obtain the following by assuming the smallest eigenvalue σ of the
Hessian of f:
$$f(z) \ge f(w) + \langle \mathrm{Exp}_w^{-1}(z), \mathrm{grad} f(w) \rangle_w + \frac{\sigma}{2} \| \mathrm{Exp}_w^{-1}(z) \|_w^2, \quad w, z \in \mathcal{U}. \tag{1}$$
2. Obtain the variance of $\xi_t^s$ from β-Lipschitz continuity:
$$E_{i_t^s}[\| \xi_t^s \|^2] \le \beta^2 \left( 14 (\mathrm{dist}(w_{t-1}^s, w^*))^2 + 8 (\mathrm{dist}(\tilde{w}^{s-1}, w^*))^2 \right). \tag{2}$$
3. Obtain the expectation of the decrease of the distance to the
solution in the inner iteration from the lemma for a geodesic
triangle in an Alexandrov space:
$$E_{i_t^s}\left[ (\mathrm{dist}(U_t^s, U^*))^2 - (\mathrm{dist}(U_{t-1}^s, U^*))^2 \right] \le E_{i_t^s}\left[ (\mathrm{dist}(U_{t-1}^s, U_t^s))^2 + 2\eta \langle \mathrm{grad} f(U_{t-1}^s), \mathrm{Exp}_{U_{t-1}^s}^{-1}(U^*) \rangle_{U_{t-1}^s} \right]. \tag{3}$$
4. Substituting (1) and (2) into (3) and summing over the inner loop
finally yields the decrease of the distance to the solution in
the outer iteration.
Numerical comparisons
Experimental conditions
Compare R-SVRG with
1. R-SGD
2. R-SD (steepest descent) with backtracking line search
Step-size algorithms
1. fixed step-size
2. decaying step-sizes
3. hybrid step-sizes
Use the decaying step-sizes for fewer than $s_{TH}$ (= 5) epochs, and
subsequently switch to a fixed step-size.
PCA problem
n = 10000, d = 20, and r = 5.
Evaluation metrics
Optimality gap
Distance to the minimum loss obtained by Matlab pca.
Norm of gradient
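The two evaluation metrics can be computed as in the sketch below. It is an assumption-laden illustration: the reference optimum is obtained here from an SVD of the uncentered data matrix rather than from Matlab's pca, the data are random, and the function names are hypothetical. The sizes n, d, r follow the experiment description above.

```python
import numpy as np

def pca_cost(U, X):
    """(1/n) * sum_i ||x_i - U U^T x_i||_2^2, where the rows of X are the samples."""
    residual = X - (X @ U) @ U.T
    return np.sum(residual ** 2) / X.shape[0]

def optimal_pca_cost(X, r):
    """Minimum cost over r-dimensional subspaces: energy outside the top-r singular directions."""
    singular_values = np.linalg.svd(X, compute_uv=False)
    return np.sum(singular_values[r:] ** 2) / X.shape[0]

def riemannian_grad_norm(U, X):
    """Norm of the Riemannian gradient: Euclidean gradient projected onto the tangent space at U."""
    egrad = -2.0 * X.T @ (X @ U) / X.shape[0]
    rgrad = egrad - U @ (U.T @ egrad)
    return np.linalg.norm(rgrad)

rng = np.random.default_rng(0)
n, d, r = 10000, 20, 5
X = rng.standard_normal((n, d))

U_rand, _ = np.linalg.qr(rng.standard_normal((d, r)))   # arbitrary starting point
U_opt = np.linalg.svd(X, full_matrices=False)[2][:r].T  # top-r right singular vectors

f_opt = optimal_pca_cost(X, r)
gap_rand = pca_cost(U_rand, X) - f_opt   # optimality gap of a random point
gap_opt = pca_cost(U_opt, X) - f_opt     # optimality gap at the optimum
```

Both metrics vanish at the optimal subspace, which is why they are plotted on a logarithmic scale against the number of gradient evaluations.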
Numerical comparisons
Results for PCA problem
[Figure: two panels, (a) Optimality gap (train loss − optimum) and (b) Norm of gradient, each on a logarithmic vertical axis against #grad/N from 0 to 250. Compared methods: R-SD; R-SGD: decay (η=0.009, λ=0.1); R-SVRG: fix (η=0.001); R-SVRG: decay (η=0.001, λ=0.001); R-SVRG: hybrid (η=0.004, λ=0.01); R-SVRG+: fix (η=0.001); R-SVRG+: decay (η=0.002, λ=0.01); R-SVRG+: hybrid (η=0.002, λ=0.01).]
Conclusions and more information
Conclusions
Propose Riemannian SVRG (R-SVRG).
R-SVRG shows local linear convergence rate.
Numerical comparisons show the effectiveness of the
algorithm.
More information
Full paper
H. Kasai, H. Sato, and B. Mishra, "Riemannian stochastic
variance reduced gradient on Grassmann manifold,"
arXiv:1605.07367, May 2016 [Kasai et al., 2016].
Matlab code
https://bamdevmishra.com/codes/rsvrg/
Thank you for your attention.
References I
Absil, P.-A., Mahony, R., and Sepulchre, R. (2008).
Optimization Algorithms on Matrix Manifolds.
Princeton University Press.
Allen-Zhu, Z. and Hazan, E. (2016).
Variance reduction for faster non-convex optimization.
Technical report, arXiv preprint arXiv:1603.05643.
Allen-Zhu, Z. and Yan, Y. (2015).
Improved SVRG for non-strongly-convex or sum-of-non-convex objectives.
Technical report, arXiv preprint arXiv:1506.01972.
Bonnabel, S. (2013).
Stochastic gradient descent on Riemannian manifolds.
IEEE Trans. on Automatic Control, 58(9):2217–2229.
Defazio, A., Bach, F., and Lacoste-Julien, S. (2014).
SAGA: A fast incremental gradient method with support for non-strongly convex
composite objectives.
In NIPS.
Garber, D. and Hazan, E. (2015).
Fast and simple PCA via convex optimization.
Technical report, arXiv preprint arXiv:1509.05647.
References II
Johnson, R. and Zhang, T. (2013).
Accelerating stochastic gradient descent using predictive variance reduction.
In NIPS, pages 315–323.
Kasai, H., Sato, H., and Mishra, B. (2016).
Riemannian stochastic variance reduced gradient on Grassmann manifold.
arXiv preprint: arXiv:1605.07367.
Mairal, J. (2015).
Incremental majorization-minimization optimization with application to large-scale
machine learning.
SIAM J. Optim., 25(2):829–855.
Roux, N. L., Schmidt, M., and Bach, F. R. (2012).
A stochastic gradient method with an exponential convergence rate for finite
training sets.
In NIPS, pages 2663–2671.
Shalev-Shwartz, S. (2015).
SDCA without duality.
Technical report, arXiv preprint arXiv:1502.06177.
Shalev-Shwartz, S. and Zhang, T. (2012).
Proximal stochastic dual coordinate ascent.
Technical report, arXiv preprint arXiv:1211.2717.
References III
Shalev-Shwartz, S. and Zhang, T. (2013).
Stochastic dual coordinate ascent methods for regularized loss minimization.
JMLR, 14:567–599.
Shamir, O. (2015).
Fast stochastic algorithms for SVD and PCA: Convergence properties and
convexity.
Technical report, arXiv preprint arXiv:1507.08788.
Zhang, Y. and Xiao, L. (2014).
Stochastic primal-dual coordinate method for regularized empirical risk
minimization.
SIAM J. Optim., 24(4):2057–2075.
Riemannian stochastic variance reduced gradient on Grassmann manifold (all copyrights owned by Kasai, Sato, and Mishra) 24

More Related Content

What's hot

[ICRA 2019] Lecture 1: Learning Dynamical Systems from Demonstrations
[ICRA 2019] Lecture 1: Learning Dynamical Systems from Demonstrations[ICRA 2019] Lecture 1: Learning Dynamical Systems from Demonstrations
[ICRA 2019] Lecture 1: Learning Dynamical Systems from DemonstrationsNadia Barbara
 
Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...Seunghyun Hwang
 
Learning from positive and unlabeled data
Learning from positive and unlabeled dataLearning from positive and unlabeled data
Learning from positive and unlabeled dataData Science Leuven
 
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会Eiji Sekiya
 
Subspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Subspace Indexing on Grassmannian Manifold for Large Scale Visual IdentificationSubspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Subspace Indexing on Grassmannian Manifold for Large Scale Visual IdentificationUnited States Air Force Academy
 
Mekanika e shkatërrimit I-Qëndrueshmëria (soliditeti)
Mekanika e shkatërrimit I-Qëndrueshmëria (soliditeti)Mekanika e shkatërrimit I-Qëndrueshmëria (soliditeti)
Mekanika e shkatërrimit I-Qëndrueshmëria (soliditeti)Rrahim Maksuti
 
From logistic regression to linear chain CRF
From logistic regression to linear chain CRFFrom logistic regression to linear chain CRF
From logistic regression to linear chain CRFDarren Yow-Bang Wang
 
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...Antonio Tejero de Pablos
 
Deep Natural Language Processing for Search Systems (sigir 2019 tutorial)
Deep Natural Language Processing for Search Systems (sigir 2019 tutorial)Deep Natural Language Processing for Search Systems (sigir 2019 tutorial)
Deep Natural Language Processing for Search Systems (sigir 2019 tutorial)Weiwei Guo
 
[MIPRO2019] Map-Matching on Big Data: a Distributed and Efficient Algorithm w...
[MIPRO2019] Map-Matching on Big Data: a Distributed and Efficient Algorithm w...[MIPRO2019] Map-Matching on Big Data: a Distributed and Efficient Algorithm w...
[MIPRO2019] Map-Matching on Big Data: a Distributed and Efficient Algorithm w...University of Bologna
 
「カルチョビット」で選手のトレーニングメニューを最適化してみた
「カルチョビット」で選手のトレーニングメニューを最適化してみた「カルチョビット」で選手のトレーニングメニューを最適化してみた
「カルチョビット」で選手のトレーニングメニューを最適化してみたMasashi Yamamoto
 
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...MLconf
 
3D Gaussian Splatting
3D Gaussian Splatting3D Gaussian Splatting
3D Gaussian Splattingtaeseon ryu
 
Superpixel algorithms (whatershed, mean-shift, SLIC, BSLIC), Foolad
Superpixel algorithms (whatershed, mean-shift, SLIC, BSLIC), FooladSuperpixel algorithms (whatershed, mean-shift, SLIC, BSLIC), Foolad
Superpixel algorithms (whatershed, mean-shift, SLIC, BSLIC), FooladShima Foolad
 
Online Machine Learning: introduction and examples
Online Machine Learning:  introduction and examplesOnline Machine Learning:  introduction and examples
Online Machine Learning: introduction and examplesFelipe
 
Basic Technology - Module 13 cloud computing
Basic Technology - Module 13 cloud computingBasic Technology - Module 13 cloud computing
Basic Technology - Module 13 cloud computingsolarisyougood
 
Lecture 7: Hidden Markov Models (HMMs)
Lecture 7: Hidden Markov Models (HMMs)Lecture 7: Hidden Markov Models (HMMs)
Lecture 7: Hidden Markov Models (HMMs)Marina Santini
 

What's hot (20)

[ICRA 2019] Lecture 1: Learning Dynamical Systems from Demonstrations
[ICRA 2019] Lecture 1: Learning Dynamical Systems from Demonstrations[ICRA 2019] Lecture 1: Learning Dynamical Systems from Demonstrations
[ICRA 2019] Lecture 1: Learning Dynamical Systems from Demonstrations
 
Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...
 
Learning from positive and unlabeled data
Learning from positive and unlabeled dataLearning from positive and unlabeled data
Learning from positive and unlabeled data
 
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
Semi-Supervised Classification with Graph Convolutional Networks @ICLR2017読み会
 
Subspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Subspace Indexing on Grassmannian Manifold for Large Scale Visual IdentificationSubspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Subspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
 
Tabnet presentation
Tabnet presentationTabnet presentation
Tabnet presentation
 
Mekanika e shkatërrimit I-Qëndrueshmëria (soliditeti)
Mekanika e shkatërrimit I-Qëndrueshmëria (soliditeti)Mekanika e shkatërrimit I-Qëndrueshmëria (soliditeti)
Mekanika e shkatërrimit I-Qëndrueshmëria (soliditeti)
 
From logistic regression to linear chain CRF
From logistic regression to linear chain CRFFrom logistic regression to linear chain CRF
From logistic regression to linear chain CRF
 
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
 
Csc446: Pattren Recognition (LN1)
Csc446: Pattren Recognition (LN1)Csc446: Pattren Recognition (LN1)
Csc446: Pattren Recognition (LN1)
 
Deep Natural Language Processing for Search Systems (sigir 2019 tutorial)
Deep Natural Language Processing for Search Systems (sigir 2019 tutorial)Deep Natural Language Processing for Search Systems (sigir 2019 tutorial)
Deep Natural Language Processing for Search Systems (sigir 2019 tutorial)
 
[MIPRO2019] Map-Matching on Big Data: a Distributed and Efficient Algorithm w...
[MIPRO2019] Map-Matching on Big Data: a Distributed and Efficient Algorithm w...[MIPRO2019] Map-Matching on Big Data: a Distributed and Efficient Algorithm w...
[MIPRO2019] Map-Matching on Big Data: a Distributed and Efficient Algorithm w...
 
「カルチョビット」で選手のトレーニングメニューを最適化してみた
「カルチョビット」で選手のトレーニングメニューを最適化してみた「カルチョビット」で選手のトレーニングメニューを最適化してみた
「カルチョビット」で選手のトレーニングメニューを最適化してみた
 
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Lab...
 
3D Gaussian Splatting
3D Gaussian Splatting3D Gaussian Splatting
3D Gaussian Splatting
 
Superpixel algorithms (whatershed, mean-shift, SLIC, BSLIC), Foolad
Superpixel algorithms (whatershed, mean-shift, SLIC, BSLIC), FooladSuperpixel algorithms (whatershed, mean-shift, SLIC, BSLIC), Foolad
Superpixel algorithms (whatershed, mean-shift, SLIC, BSLIC), Foolad
 
Online Machine Learning: introduction and examples
Online Machine Learning:  introduction and examplesOnline Machine Learning:  introduction and examples
Online Machine Learning: introduction and examples
 
Basic Technology - Module 13 cloud computing
Basic Technology - Module 13 cloud computingBasic Technology - Module 13 cloud computing
Basic Technology - Module 13 cloud computing
 
2.2.ppt.SC
2.2.ppt.SC2.2.ppt.SC
2.2.ppt.SC
 
Lecture 7: Hidden Markov Models (HMMs)
Lecture 7: Hidden Markov Models (HMMs)Lecture 7: Hidden Markov Models (HMMs)
Lecture 7: Hidden Markov Models (HMMs)
 

Viewers also liked

ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...
ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...
ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...Hiroyuki KASAI
 
Basics of Vedic Mathematics - Multiplication (1 of 2)
Basics of Vedic Mathematics - Multiplication (1 of 2)Basics of Vedic Mathematics - Multiplication (1 of 2)
Basics of Vedic Mathematics - Multiplication (1 of 2)A V Prakasam
 
Parallel Numerical Methods for Ordinary Differential Equations: a Survey
Parallel Numerical Methods for Ordinary Differential Equations: a SurveyParallel Numerical Methods for Ordinary Differential Equations: a Survey
Parallel Numerical Methods for Ordinary Differential Equations: a SurveyUral-PDC
 
The two dimensional wave equation
The two dimensional wave equationThe two dimensional wave equation
The two dimensional wave equationGermán Ceballos
 
Waves in 2 Dimensions
Waves in 2 DimensionsWaves in 2 Dimensions
Waves in 2 DimensionsBruce Coulter
 
2 Dimensional Wave Equation Analytical and Numerical Solution
2 Dimensional Wave Equation Analytical and Numerical Solution2 Dimensional Wave Equation Analytical and Numerical Solution
2 Dimensional Wave Equation Analytical and Numerical SolutionAmr Mousa
 
Manifold learning
Manifold learningManifold learning
Manifold learningWei Yang
 
Numerical Methods - Power Method for Eigen values
Numerical Methods - Power Method for Eigen valuesNumerical Methods - Power Method for Eigen values
Numerical Methods - Power Method for Eigen valuesDr. Nirav Vyas
 
Numerical Analysis (Solution of Non-Linear Equations)
Numerical Analysis (Solution of Non-Linear Equations)Numerical Analysis (Solution of Non-Linear Equations)
Numerical Analysis (Solution of Non-Linear Equations)Asad Ali
 
Application of Numerical method in Real Life
Application of Numerical method in Real LifeApplication of Numerical method in Real Life
Application of Numerical method in Real LifeTaqwa It Center
 
Applications of numerical methods
Applications of numerical methodsApplications of numerical methods
Applications of numerical methodsTarun Gehlot
 

Viewers also liked (15)

ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...
ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...
ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...
 
Basics of Vedic Mathematics - Multiplication (1 of 2)
Basics of Vedic Mathematics - Multiplication (1 of 2)Basics of Vedic Mathematics - Multiplication (1 of 2)
Basics of Vedic Mathematics - Multiplication (1 of 2)
 
Parallel Numerical Methods for Ordinary Differential Equations: a Survey
Parallel Numerical Methods for Ordinary Differential Equations: a SurveyParallel Numerical Methods for Ordinary Differential Equations: a Survey
Parallel Numerical Methods for Ordinary Differential Equations: a Survey
 
The two dimensional wave equation
The two dimensional wave equationThe two dimensional wave equation
The two dimensional wave equation
 
Waves in 2 Dimensions
Waves in 2 DimensionsWaves in 2 Dimensions
Waves in 2 Dimensions
 
Power method
Power methodPower method
Power method
 
Applications of numerical methods
Applications of numerical methodsApplications of numerical methods
Applications of numerical methods
 
2 Dimensional Wave Equation Analytical and Numerical Solution
2 Dimensional Wave Equation Analytical and Numerical Solution2 Dimensional Wave Equation Analytical and Numerical Solution
2 Dimensional Wave Equation Analytical and Numerical Solution
 
Manifold learning
Manifold learningManifold learning
Manifold learning
 
APPLICATION OF NUMERICAL METHODS IN SMALL SIZE
APPLICATION OF NUMERICAL METHODS IN SMALL SIZEAPPLICATION OF NUMERICAL METHODS IN SMALL SIZE
APPLICATION OF NUMERICAL METHODS IN SMALL SIZE
 
Numerical Methods - Power Method for Eigen values
Numerical Methods - Power Method for Eigen valuesNumerical Methods - Power Method for Eigen values
Numerical Methods - Power Method for Eigen values
 
Numerical Analysis (Solution of Non-Linear Equations)
Numerical Analysis (Solution of Non-Linear Equations)Numerical Analysis (Solution of Non-Linear Equations)
Numerical Analysis (Solution of Non-Linear Equations)
 
Application of Numerical method in Real Life
Application of Numerical method in Real LifeApplication of Numerical method in Real Life
Application of Numerical method in Real Life
 
Applications of numerical methods
Applications of numerical methodsApplications of numerical methods
Applications of numerical methods
 
Vedic Mathematics ppt
Vedic Mathematics pptVedic Mathematics ppt
Vedic Mathematics ppt
 

Similar to Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT2016)

Alexei Starobinsky "New results on inflation and pre-inflation in modified gr...
Alexei Starobinsky "New results on inflation and pre-inflation in modified gr...Alexei Starobinsky "New results on inflation and pre-inflation in modified gr...
Alexei Starobinsky "New results on inflation and pre-inflation in modified gr...SEENET-MTP
 
A numerical analysis of three dimensional darcy model in an inclined rect
A numerical analysis of three dimensional darcy model in an inclined rectA numerical analysis of three dimensional darcy model in an inclined rect
A numerical analysis of three dimensional darcy model in an inclined rectIAEME Publication
 
Value Function Geometry and Gradient TD
Value Function Geometry and Gradient TDValue Function Geometry and Gradient TD
Value Function Geometry and Gradient TDAshwin Rao
 
An Intelligent Method for Accelerating the Convergence of Different Versions ...
An Intelligent Method for Accelerating the Convergence of Different Versions ...An Intelligent Method for Accelerating the Convergence of Different Versions ...
An Intelligent Method for Accelerating the Convergence of Different Versions ...aciijournal
 
Paper Review: An exact mapping between the Variational Renormalization Group ...
Paper Review: An exact mapping between the Variational Renormalization Group ...Paper Review: An exact mapping between the Variational Renormalization Group ...
Paper Review: An exact mapping between the Variational Renormalization Group ...Kai-Wen Zhao
 
Reference velocity sensitivity for the marine internal multiple attenuation a...
Reference velocity sensitivity for the marine internal multiple attenuation a...Reference velocity sensitivity for the marine internal multiple attenuation a...
Reference velocity sensitivity for the marine internal multiple attenuation a...Arthur Weglein
 
An Extension to the Zero-Inflated Generalized Power Series Distributions
An Extension to the Zero-Inflated Generalized Power Series DistributionsAn Extension to the Zero-Inflated Generalized Power Series Distributions
An Extension to the Zero-Inflated Generalized Power Series Distributionsinventionjournals
 
Coordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerCoordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerChristian Robert
 
Alexei Starobinsky - Inflation: the present status
Alexei Starobinsky - Inflation: the present statusAlexei Starobinsky - Inflation: the present status
Alexei Starobinsky - Inflation: the present statusSEENET-MTP
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distancesChristian Robert
 
ESL 4.4.3-4.5: Logistic Reression (contd.) and Separating Hyperplane
ESL 4.4.3-4.5: Logistic Reression (contd.) and Separating HyperplaneESL 4.4.3-4.5: Logistic Reression (contd.) and Separating Hyperplane
ESL 4.4.3-4.5: Logistic Reression (contd.) and Separating HyperplaneShinichi Tamura
 
2 borgs
2 borgs2 borgs
2 borgsYandex
 
Stochastic augmentation by generalized minimum variance control with rst loop...
Stochastic augmentation by generalized minimum variance control with rst loop...Stochastic augmentation by generalized minimum variance control with rst loop...
Stochastic augmentation by generalized minimum variance control with rst loop...UFPA
 
Random Matrix Theory and Machine Learning - Part 2
Random Matrix Theory and Machine Learning - Part 2Random Matrix Theory and Machine Learning - Part 2
Random Matrix Theory and Machine Learning - Part 2Fabian Pedregosa
 
20150304 ims mikiya_fujii_dist
20150304 ims mikiya_fujii_dist20150304 ims mikiya_fujii_dist
20150304 ims mikiya_fujii_distFujii Mikiya
 

Riemannian stochastic variance reduced gradient on Grassmann manifold (ICCOPT2016)

  • 1. Riemannian stochastic variance reduced gradient on Grassmann manifold Hiroyuki Kasai†, Hiroyuki Sato§, and Bamdev Mishra†† †The University of Electro-Communications, Japan §Tokyo University of Science, Japan ††Amazon Development Centre India, India August 10, 2016 Riemannian stochastic variance reduced gradient on Grassmann manifold (all copyrights owned by Kasai, Sato, and Mishra) 1
  • 2. Summary (our contributions)
    • Address the stochastic gradient descent (SGD) algorithm for the empirical risk minimization problem $\min_{w \in \mathbb{R}^d} \sum_{i=1}^{n} f_i(w)$.
    • In particular, consider structured problems on manifolds, i.e., $w \in \mathcal{M}$.
    • Propose Riemannian SVRG (R-SVRG), which extends SVRG from Euclidean space to Riemannian manifolds.
    • Give two analyses: a global convergence analysis and a local convergence rate analysis.
    • Show the effectiveness of R-SVRG through numerical comparisons.
  • 3. Stochastic gradient method (SGD) (1)
    • Update in SGD: $w_k = w_{k-1} - \alpha_k \nabla f_{i_k}(w_{k-1})$, where $w_{k-1}$ is the current point, $i_k$ is a randomly chosen sample, and $\nabla f_{i_k}(w_{k-1})$ is the gradient for the $i_k$-th sample alone (the stochastic gradient).
    • The stochastic gradient is an unbiased estimate of the full gradient: $\mathbb{E}[\nabla f_i(w)] = \frac{1}{n} \sum_{i=1}^{n} \nabla f_i(w) = \nabla f(w)$.
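The update and the unbiasedness property above can be sketched numerically. This is a minimal illustration, not the authors' code: the least-squares loss, the synthetic data, and the decay rule `alpha0 / (1 + decay * k)` are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.standard_normal((n, d))                      # synthetic features (assumed)
b = A @ np.ones(d) + 0.1 * rng.standard_normal(n)    # synthetic targets (assumed)

def loss(w):
    # f(w) = (1/n) * sum_i f_i(w), with f_i(w) = 0.5 * (a_i^T w - b_i)^2
    return 0.5 * np.mean((A @ w - b) ** 2)

def grad_i(w, i):
    # Gradient of the single-sample loss f_i
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    # Gradient of the averaged loss f
    return A.T @ (A @ w - b) / n

def sgd(w0, n_steps, alpha0, decay):
    w = w0.copy()
    for k in range(1, n_steps + 1):
        i_k = rng.integers(n)                  # random sample index
        alpha_k = alpha0 / (1.0 + decay * k)   # assumed decaying step-size rule
        w = w - alpha_k * grad_i(w, i_k)
    return w

w0 = np.zeros(d)
# Averaging the per-sample gradients over all i recovers the full gradient.
avg_grad = np.mean([grad_i(w0, i) for i in range(n)], axis=0)
w_sgd = sgd(w0, n_steps=2000, alpha0=0.05, decay=0.01)
print(loss(w0), loss(w_sgd))
```

Each SGD step touches one row of `A`, which is why its cost does not depend on n, while `full_grad` scans all n rows.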
  • 4. Stochastic gradient descent (SGD) (2)
    Features compared with full gradient descent (FGD):
    • Pros: high scalability to large-scale data. The per-iteration cost of SGD is independent of n, whereas that of FGD is linear in n.
    • Cons: slow convergence. Decaying step-sizes are needed for convergence, to avoid both large fluctuations around a solution (step-size too large) and overly slow progress (step-size too small).
    • As a result, SGD has a sub-linear convergence rate, $\mathbb{E}[f(w_k)] - f(w^*) \in O(k^{-1})$, whereas FGD attains $f(w_k) - f(w^*) \in O(c^k)$.
  • 5. Speeding up of SGD: Variance reduction technique
    • Variance reduction accelerates the convergence rate of SGD [Mairal, 2015, Roux et al., 2012, Shalev-Shwartz and Zhang, 2012, Shalev-Shwartz and Zhang, 2013, Defazio et al., 2014, Zhang and Xiao, 2014].
    • Stochastic variance reduced gradient (SVRG) [Johnson and Zhang, 2013] attains a linear convergence rate for strongly convex functions.
    • Variants:
      • [Garber and Hazan, 2015] analyze the convergence rate of SVRG when f is a convex function that is a sum of non-convex (but smooth) terms.
      • [Shalev-Shwartz, 2015] proposes similar results.
      • [Allen-Zhu and Yan, 2015] further study the same case with better convergence rates.
      • [Shamir, 2015] studies specifically the convergence properties of the variance reduction PCA algorithm.
      • Very recently, [Allen-Zhu and Hazan, 2016] propose a variance reduction method for faster non-convex optimization.
  • 6. Stochastic variance reduced gradient (SVRG) (1) [Johnson and Zhang, 2013]
    • Motivations: reduce the variance of the stochastic gradients. Unlike SAG, there is no need to store all gradients; in exchange, additional gradient computations are allowed.
    • Basic idea: a hybrid algorithm of SGD and FGD. Periodically, calculate and store a full gradient. At every iteration, adjust the stochastic gradient v by the latest full gradient to reduce its variance.
    • Result: a linear convergence rate, $\mathbb{E}[f(\tilde{w}^s)] - \mathbb{E}[f(\tilde{w}^*)] \le \alpha^s \left(\mathbb{E}[f(\tilde{w}^0)] - \mathbb{E}[f(\tilde{w}^*)]\right)$.
  • 7. Stochastic variance reduced gradient (SVRG) (2)
    Simplified algorithm of SVRG:
      Initial iterate $w_0^0 \in \mathcal{M}$.
      for s = 1, 2, ... (outer loop) do
        Store $\tilde{w} = w_t^{s-1}$.
        Store the full gradient $\nabla f(\tilde{w})$.
        for t = 1, 2, ..., $m_s$ (inner loop) do
          Calculate the modified stochastic gradient $v_t^s = \nabla f_{i_t^s}(w_{t-1}^s) - \nabla f_{i_t^s}(\tilde{w}) + \nabla f(\tilde{w})$, i.e., the single gradient at $w_{t-1}^s$, minus the single gradient at $\tilde{w}$, plus the full gradient.
          Update $w_t^s = w_{t-1}^s - \alpha v_t^s$.
        end for
      end for
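The simplified algorithm above can be sketched in Euclidean space as follows. This is an illustrative sketch under assumptions: the least-squares objective, the synthetic data, the fixed inner-loop length `m`, and the step-size value are not taken from the slides.

```python
import numpy as np

def svrg(A, b, alpha, n_outer, m, seed=0):
    """SVRG for f(w) = (1/n) * sum_i 0.5 * (a_i^T w - b_i)^2 (assumed test problem)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for s in range(n_outer):                       # outer loop
        w_tilde = w.copy()                         # store the snapshot point
        g_tilde = A.T @ (A @ w_tilde - b) / n      # store the full gradient at the snapshot
        for t in range(m):                         # inner loop
            i = rng.integers(n)
            a_i = A[i]
            # Modified stochastic gradient: single gradient at w, minus single
            # gradient at the snapshot, plus the stored full gradient.
            v = (a_i @ w - b[i]) * a_i - (a_i @ w_tilde - b[i]) * a_i + g_tilde
            w = w - alpha * v                      # fixed step-size update
    return w

rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.standard_normal((n, d))
b = A @ np.ones(d) + 0.1 * rng.standard_normal(n)

w_hat = svrg(A, b, alpha=0.005, n_outer=30, m=2000)
w_star = np.linalg.lstsq(A, b, rcond=None)[0]      # reference least-squares solution
print(np.linalg.norm(w_hat - w_star))
```

Only one full gradient is stored per outer loop, in contrast to SAG, which keeps one gradient per sample.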
  • 8. Stochastic variance reduced gradient (SVRG) (3) [Johnson and Zhang, 2013]
  • 9. Structured problems
    Examples:
    • PCA problem: calculate the projection matrix U that minimizes $\min_{U \in \mathrm{St}(r,d)} \frac{1}{n} \sum_{i=1}^{n} \| x_i - U U^T x_i \|_2^2$.
    • U belongs to the Stiefel manifold St(r, d), the set of matrices of size d × r with orthonormal columns, i.e., $U^T U = I$.
    • The cost function remains unchanged under the orthogonal group action $U \mapsto UO$ for $O \in \mathcal{O}(r)$.
    • Hence U belongs to the Grassmann manifold Gr(r, d), the set of r-dimensional linear subspaces in $\mathbb{R}^d$, each represented by a d × r matrix with orthonormal columns, i.e., $U^T U = I$.
    • Other examples (not exhaustive): matrix completion, subspace tracking, spectral clustering, CCA, bi-factor regression, ...
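The invariance of the PCA cost under $U \mapsto UO$ can be checked numerically. A small sketch with random data (the sizes and the data are assumptions made for the illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 100, 8, 3
X = rng.standard_normal((n, d))                    # rows are the samples x_i

def pca_cost(U, X):
    # (1/n) * sum_i ||x_i - U U^T x_i||_2^2, written for row-stacked samples
    R = X - (X @ U) @ U.T
    return np.mean(np.sum(R ** 2, axis=1))

U, _ = np.linalg.qr(rng.standard_normal((d, r)))   # a point on St(r, d)
O, _ = np.linalg.qr(rng.standard_normal((r, r)))   # an orthogonal r x r matrix

cost_U = pca_cost(U, X)
cost_UO = pca_cost(U @ O, X)                       # same subspace, different basis
print(cost_U, cost_UO)
```

Because `U @ O` spans the same subspace as `U`, both calls return the same value up to rounding, which is why the natural search space is Gr(r, d) rather than St(r, d).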
  • 10. Optimization on Riemannian manifolds [Absil et al., 2008]
    If the constraints can be defined by a manifold, the constrained problem $\min_{w \in \mathbb{R}^n} f(w)$ subject to $c_i(w) = 0$, $c_j(w) \le 0$ is viewed as the unconstrained problem $\min_{w \in \mathcal{M}} f(w)$ on the manifold, where $\mathcal{M}$ is a Riemannian manifold.
  • 11. Riemannian SGD (R-SGD) (1) [Bonnabel, 2013]
    Extension of Euclidean SGD to Riemannian manifolds.
    • Update in R-SGD: $w_k = \mathrm{Exp}_{w_{k-1}}\left(-\alpha_k \, \mathrm{grad} f_{i_k}(w_{k-1})\right)$, where $\mathrm{grad} f_{i_k}(w_{k-1})$ is the Riemannian stochastic gradient and the exponential mapping moves along a geodesic.
      1. Calculate the Riemannian stochastic gradient $\mathrm{grad} f_{i_k}(w_{k-1})$ for the sample $i_k$ at $w_{k-1}$.
      2. Then move along the geodesic from $w_{k-1}$ in the direction of $-\mathrm{grad} f_{i_k}(w_{k-1})$.
    • A geodesic is the generalization of a straight line in Euclidean space.
    • The exponential mapping $\mathrm{Exp}_w(\cdot)$ specifies the geodesic.
  • 12. Riemannian SGD (R-SGD) (2) [Bonnabel, 2013]
  • 13. Proposal: Riemannian SVRG (R-SVRG) [Kasai et al., 2016]
    • Propose a novel extension of SVRG from Euclidean space to a Riemannian manifold search space. The extension is not trivial.
    • Focus on the Grassmann manifold Gr(r, d). The approach can be generalized to other compact Riemannian manifolds.
    • Notations:
      | Quantity | SVRG | R-SVRG |
      |---|---|---|
      | Model parameter | $w_{t-1}^s \in \mathbb{R}^n$ | $U_{t-1}^s \in \mathrm{Gr}(r,d)$ |
      | Edge point of outer loop | $\tilde{w} \in \mathbb{R}^n$ | $\tilde{U} \in \mathrm{Gr}(r,d)$ |
      | Stochastic gradient | $\nabla f_{i_t^s}(w_{t-1}^s) \in \mathbb{R}^n$ | $\mathrm{grad} f_{i_t^s}(U_{t-1}^s) \in T_{U_{t-1}^s}\mathrm{Gr}(r,d)$ |
      | Modified stochastic gradient | $v_t^s \in \mathbb{R}^n$ | $\xi_t^s \in T_{U_{t-1}^s}\mathrm{Gr}(r,d)$ |
  • 14. Proposal: Riemannian SVRG (R-SVRG) Algorithm
    • Straightforward modification of the stochastic gradient: extending the SVRG case $v_t^s = \nabla f_{i_t^s}(w_{t-1}^s) - \nabla f_{i_t^s}(\tilde{w}) + \nabla f(\tilde{w})$ directly gives $\xi_t^s = \mathrm{grad} f_{i_t^s}(U_{t-1}^s) - \mathrm{grad} f_{i_t^s}(\tilde{U}) + \mathrm{grad} f(\tilde{U})$. This is meaningless because a manifold is not a vector space: the three terms live in different tangent spaces.
    • Proposed modification: transport the vectors at $\tilde{U}$ into the current tangent space at $U_{t-1}^s$ by parallel translation, then add them: $\xi_t^s = \mathrm{grad} f_{i_t^s}(U_{t-1}^s) + P_{\gamma}^{U_{t-1}^s \leftarrow \tilde{U}}\left(-\mathrm{grad} f_{i_t^s}(\tilde{U}) + \mathrm{grad} f(\tilde{U})\right)$, where $P_{\gamma}^{U_{t-1}^s \leftarrow \tilde{U}}$ is the parallel-translation operator along the geodesic $\gamma$.
    • The logarithm mapping gives the tangent vector for the geodesic $\gamma$.
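A short check of why the transported correction keeps the modified gradient unbiased. It assumes, as on the SGD slide, that $i_t^s$ is sampled uniformly and that $f = \frac{1}{n}\sum_i f_i$; it also uses the fact that parallel translation is a linear map between tangent spaces and that the geodesic $\gamma$ does not depend on $i_t^s$:

```latex
\begin{aligned}
\mathbb{E}_{i_t^s}\!\left[\xi_t^s\right]
&= \mathbb{E}_{i_t^s}\!\left[\operatorname{grad} f_{i_t^s}(U_{t-1}^s)\right]
 + P_{\gamma}^{U_{t-1}^s \leftarrow \tilde{U}}
   \!\left(-\,\mathbb{E}_{i_t^s}\!\left[\operatorname{grad} f_{i_t^s}(\tilde{U})\right]
   + \operatorname{grad} f(\tilde{U})\right) \\
&= \operatorname{grad} f(U_{t-1}^s)
 + P_{\gamma}^{U_{t-1}^s \leftarrow \tilde{U}}
   \!\left(-\operatorname{grad} f(\tilde{U}) + \operatorname{grad} f(\tilde{U})\right) \\
&= \operatorname{grad} f(U_{t-1}^s).
\end{aligned}
```

So $\xi_t^s$ has the full Riemannian gradient as its mean, just as $v_t^s$ has $\nabla f(w_{t-1}^s)$ as its mean in the Euclidean case.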
  • 15. Proposal: Riemannian SVRG (R-SVRG) Conceptual illustration
  • 16. Tools in Grassmann manifold
    • Exponential mapping in the direction of $\xi \in T_{U(0)}$: $U(t) = \begin{bmatrix} U(0)V & W \end{bmatrix} \begin{bmatrix} \cos t\Sigma \\ \sin t\Sigma \end{bmatrix} V^T$, where $\xi = W \Sigma V^T$ is the rank-r singular value decomposition of $\xi$. The $\cos(\cdot)$ and $\sin(\cdot)$ operations act only on the diagonal entries.
    • Parallel translation of $\zeta \in T_{U(0)}$ along $\gamma(t)$ with direction $\xi$: $\zeta(t) = \left( \begin{bmatrix} U(0)V & W \end{bmatrix} \begin{bmatrix} -\sin t\Sigma \\ \cos t\Sigma \end{bmatrix} W^T + (I - W W^T) \right) \zeta$.
    • Logarithm mapping of $U(t)$ at $U(0)$: $\xi = \log_{U(0)}(U(t)) = W \arctan(\Sigma) V^T$, where $W \Sigma V^T$ is the rank-r singular value decomposition of $(U(t) - U(0)U(0)^T U(t))(U(0)^T U(t))^{-1}$.
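The three tools above, combined with the modified gradient $\xi_t^s$, are enough for a compact R-SVRG sketch on the PCA problem. This is an illustrative sketch, not the authors' Matlab code: the synthetic data, the step size, the loop lengths, and all function names are assumptions. One implementation detail is also an assumption of this sketch: before adding vectors, the current iterate is replaced by the representative `exp_map(U_tilde, gamma)` of the same subspace, so that the transported vector and the fresh gradient are expressed at the same matrix.

```python
import numpy as np

def exp_map(U, xi):
    """Exp_U(xi): end point at t = 1 of the geodesic leaving U in direction xi."""
    W, s, Vt = np.linalg.svd(xi, full_matrices=False)   # xi = W diag(s) V^T
    return ((U @ Vt.T) * np.cos(s)) @ Vt + (W * np.sin(s)) @ Vt

def log_map(U, Y):
    """Tangent vector xi at U such that Exp_U(xi) spans the same subspace as Y."""
    UtY = U.T @ Y
    M = (Y - U @ UtY) @ np.linalg.inv(UtY)
    W, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (W * np.arctan(s)) @ Vt

def transport(U, xi, zeta):
    """Parallel translation of zeta from T_U to T_{Exp_U(xi)} along the geodesic."""
    W, s, Vt = np.linalg.svd(xi, full_matrices=False)
    WtZ = W.T @ zeta
    return (W * np.cos(s) - (U @ Vt.T) * np.sin(s)) @ WtZ + zeta - W @ WtZ

def rgrad(U, Xb):
    """Riemannian gradient of the mean of ||x - U U^T x||^2 over the rows x of Xb."""
    G = -2.0 * Xb.T @ (Xb @ U) / Xb.shape[0]   # Euclidean gradient (using U^T U = I)
    return G - U @ (U.T @ G)                   # projection onto the tangent space at U

def pca_cost(U, X):
    R = X - (X @ U) @ U.T
    return np.mean(np.sum(R ** 2, axis=1))

def rsvrg_pca(X, r, alpha, n_outer, m, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U, _ = np.linalg.qr(rng.standard_normal((d, r)))    # random initial point
    for s in range(n_outer):
        U_tilde = U                                     # edge point of the outer loop
        g_tilde = rgrad(U_tilde, X)                     # full Riemannian gradient
        for t in range(m):
            x = X[rng.integers(n)][None, :]             # one random sample as a 1 x d row
            gamma = log_map(U_tilde, U)                 # geodesic direction from U_tilde to U
            U = exp_map(U_tilde, gamma)                 # same subspace, aligned representative
            zeta = g_tilde - rgrad(U_tilde, x)          # correction in the tangent space at U_tilde
            xi = rgrad(U, x) + transport(U_tilde, gamma, zeta)   # modified stochastic gradient
            U = exp_map(U, -alpha * xi)                 # move along the geodesic
    return U

rng = np.random.default_rng(0)
n, d, r = 200, 10, 3
X = rng.standard_normal((n, d)) * np.sqrt([5, 4, 3] + [0.1] * (d - r))   # assumed spectrum
U_hat = rsvrg_pca(X, r, alpha=0.002, n_outer=120, m=50)
U_opt = np.linalg.svd(X, full_matrices=False)[2][:r].T   # top-r right singular vectors
gap = pca_cost(U_hat, X) - pca_cost(U_opt, X)
print(f"optimality gap: {gap:.2e}")
```

The optimality gap is measured against the subspace obtained from a singular value decomposition of the data, in the spirit of the evaluation metric used in the numerical comparisons.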
  • 17. Main results: convergence analyses
    • Global convergence analysis with decaying step-sizes: guarantees that the iteration converges globally to a critical point starting from any initialization point.
    • Local convergence rate analysis under a fixed step-size: considers the rate in a neighborhood of a local minimum, and assumes that Lipschitz smoothness and the lower bound on the Hessian hold only in this neighborhood.
    • The resulting local linear convergence rate is $\mathbb{E}[(\mathrm{dist}(\tilde{U}^s, U^*))^2] \le \frac{4(1 + 8m\alpha^2\beta^2)}{\alpha m(\sigma - 14\eta\beta^2)} \, \mathbb{E}[(\mathrm{dist}(\tilde{U}^{s-1}, U^*))^2]$.
  • 18. Proof sketch for local convergence rate
    1. Assuming $\sigma$ bounds the smallest eigenvalue of the Hessian of f from below, obtain
       $f(z) \ge f(w) + \langle \mathrm{Exp}_w^{-1}(z), \mathrm{grad} f(w) \rangle_w + \frac{\sigma}{2} \| \mathrm{Exp}_w^{-1}(z) \|_w^2, \quad w, z \in \mathcal{U}.$ (1)
    2. Bound the variance of $\xi_t^s$ using $\beta$-Lipschitz continuity:
       $\mathbb{E}_{i_t^s}[\| \xi_t^s \|^2] \le \beta^2 \left( 14 (\mathrm{dist}(w_{t-1}^s, w^*))^2 + 8 (\mathrm{dist}(\tilde{w}^{s-1}, w^*))^2 \right).$ (2)
    3. Bound the expected decrease of the distance to the solution in the inner iteration, using the lemma for a geodesic triangle in an Alexandrov space:
       $\mathbb{E}_{i_t^s}\left[ (\mathrm{dist}(U_t^s, U^*))^2 - (\mathrm{dist}(U_{t-1}^s, U^*))^2 \right] \le \mathbb{E}_{i_t^s}\left[ (\mathrm{dist}(U_{t-1}^s, U_t^s))^2 + 2\eta \langle \mathrm{grad} f(U_{t-1}^s), \mathrm{Exp}_{U_{t-1}^s}^{-1}(U^*) \rangle_{U_{t-1}^s} \right].$ (3)
    4. Substituting (1) and (2) into (3) and summing over the inner loop yields the decrease of the distance to the solution in the outer iteration.
  • 19. Numerical comparisons: experimental conditions
    • Compare R-SVRG with (1) R-SGD and (2) R-SD (steepest descent) with backtracking line search.
    • Step-size algorithms: (1) fixed step-size, (2) decaying step-sizes, (3) hybrid step-sizes. The hybrid rule uses the decaying step-sizes for epochs below $s_{TH}$ (= 5) and subsequently switches to a fixed step-size.
    • PCA problem: n = 10000, d = 20, and r = 5.
    • Evaluation metrics: the optimality gap (distance to the minimum loss obtained by Matlab pca) and the norm of the gradient.
  • 20. Numerical comparisons: results for PCA problem
    [Figure: two panels plotted against #grad/N (0 to 250), both on a log scale: (a) optimality gap (train loss minus optimum) and (b) norm of gradient. Curves in both panels: R-SD; R-SGD, decay (η=0.009, λ=0.1); R-SVRG, fix (η=0.001); R-SVRG, decay (η=0.001, λ=0.001); R-SVRG, hybrid (η=0.004, λ=0.01); R-SVRG+, fix (η=0.001); R-SVRG+, decay (η=0.002, λ=0.01); R-SVRG+, hybrid (η=0.002, λ=0.01).]
  • 21. Conclusions and more information
    • Conclusions: propose Riemannian SVRG (R-SVRG). R-SVRG shows a local linear convergence rate. Numerical comparisons show the effectiveness of the algorithm.
    • Full paper: H. Kasai, H. Sato, and B. Mishra, "Riemannian stochastic variance reduced gradient on Grassmann manifold," arXiv:1605.07367, May 2016 [Kasai et al., 2016].
    • Matlab code: https://bamdevmishra.com/codes/rsvrg/
    Thank you for your attention.
  • 22. References I
    Absil, P.-A., Mahony, R., and Sepulchre, R. (2008). Optimization Algorithms on Matrix Manifolds. Princeton University Press.
    Allen-Zhu, Z. and Hazan, E. (2016). Variance reduction for faster non-convex optimization. Technical report, arXiv preprint arXiv:1603.05643.
    Allen-Zhu, Z. and Yan, Y. (2015). Improved SVRG for non-strongly-convex or sum-of-non-convex objectives. Technical report, arXiv preprint arXiv:1506.01972.
    Bonnabel, S. (2013). Stochastic gradient descent on Riemannian manifolds. IEEE Trans. on Automatic Control, 58(9):2217–2229.
    Defazio, A., Bach, F., and Lacoste-Julien, S. (2014). SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. In NIPS.
    Garber, D. and Hazan, E. (2015). Fast and simple PCA via convex optimization. Technical report, arXiv preprint arXiv:1509.05647.
  • 23. References II
    Johnson, R. and Zhang, T. (2013). Accelerating stochastic gradient descent using predictive variance reduction. In NIPS, pages 315–323.
    Kasai, H., Sato, H., and Mishra, B. (2016). Riemannian stochastic variance reduced gradient on Grassmann manifold. arXiv preprint arXiv:1605.07367.
    Mairal, J. (2015). Incremental majorization-minimization optimization with application to large-scale machine learning. SIAM J. Optim., 25(2):829–855.
    Roux, N. L., Schmidt, M., and Bach, F. R. (2012). A stochastic gradient method with an exponential convergence rate for finite training sets. In NIPS, pages 2663–2671.
    Shalev-Shwartz, S. (2015). SDCA without duality. Technical report, arXiv preprint arXiv:1502.06177.
    Shalev-Shwartz, S. and Zhang, T. (2012). Proximal stochastic dual coordinate ascent. Technical report, arXiv preprint arXiv:1211.2717.
  • 24. References III
    Shalev-Shwartz, S. and Zhang, T. (2013). Stochastic dual coordinate ascent methods for regularized loss minimization. JMLR, 14:567–599.
    Shamir, O. (2015). Fast stochastic algorithms for SVD and PCA: Convergence properties and convexity. Technical report, arXiv preprint arXiv:1507.08788.
    Zhang, Y. and Xiao, L. (2014). Stochastic primal-dual coordinate method for regularized empirical risk minimization. SIAM J. Optim., 24(4):2057–2075.