1. QUADRATIC PROGRAMMING SOLVER FOR
NON-NEGATIVE MATRIX FACTORIZATION
WITH SPARK
DebasishDas SantanuDas
debasish.das@verizon.com santanu.das@verizon.com
BigDataAnalytics
Verizon
2. ROADMAP
Some overview on Matrix Factorization
QP formulation of Non-negative Matrix Factorization (NMF)
Algorithms to solve quadratic programming problems
3. ROADMAP
Some overview on Matrix Factorization
QP formulation of Non-negative Matrix Factorization (NMF)
Algorithms to solve quadratic programming problems
Some QP Applications (on MovieLens data)
unconstrained
linear constrained
Results & Discussions
4. MATRIX FACTORIZATION
What is it- To decompose observed data (R rating matrix):
User factors matrix ( )
Movie factor matrix ( )
H
W
Solve for &W H
D(R|W, H) = ||R − WH| +
1
2
|
2
F
||W|| + ||W| + |H||| + ||H|αl1w αl2w |
2
F
λl1h λl2h |
2
F
5. REGULARIZED ALTERNATING LEAST SQUARE (RALS)
Fixed-point RALS Algorithm: Equating gradient to zero, obtain
iterative update scheme of ,
estimate , given (innerloop:tolerancebasedsolutionsareprovided)
enforce positivity (NMF)
REPEAT until convergence (outerloop:factormatricesareupdateduntilconvergence)
Current effort aims to introduce exibility to impose additional
constraints (e.g. bounds on variables, sparsity, etc.)
W H
H W
6. NMF - ESSENTIALLY CLUSTERING
Solve independent problems:
Aggregated solutions:
Our contribution: given , we compute cluster possibilities ( )
using a QP solver in Spark Mllib. (Note:Thisisinnerloop)
D(R|W, H) = || − W |
1
2
∑n
i=1
ri hi |
2
2
n
mi || − W |n ≥0hi
1
2
ri hi |
2
2
H = [ , , , … ]h1 h2 h3 hn
W H
7. QP TO ADMM/SOCP
ADMM : solves by decomposing a hard problem into simpler
yet efficiently solvable sub-problems and let them achieve
consensus.
8. QP TO ADMM/SOCP
ADMM : solves by decomposing a hard problem into simpler
yet efficiently solvable sub-problems and let them achieve
consensus.
ECOS: solves a speci c class of problems that can be
formulated as a second-order cone program (SOCP) using
primal-dual interior-point method.
9. QP TO ADMM/SOCP
Use: Our priliminary investigation shows that ADMM solves
certain class of problems (e.g. bounds, l1 minimization) much
faster than ECOS while the later proves effective in handling
relatively complicated constraints (e.g. equality constraints).
10. QP: ADMM FORMULATION
Objective
Constraints
ADMM formulation
ADMM Steps
f (h) : 0.5||r − Wh| => 0.5 (W )h − ( W)h|
2
2
h
T
W
T
r
T
g(z) : z >= 0
f (h) + g(z)
s.t h = z
= argmi f (h + 0.5 × ρ||h − + | )h
k+1
nh z
k
u
k
|
2
2
= + s.t ∈ g(z)z
k+1
h
k+1
u
k
z
k+1
= + −u
k+1
u
k
hk+1 zk+1
13. QP: SOCP FORMULATION
Objective transformation
Constraints
Quadratic constraint transformation
minimize t
s.t 0.5 (W )h − ( W)h ≤ th
T
W
T
r
T
h >= 0
Aeq × h = Beq
A × h ≤ B
≤ d
∣
∣
∣
hQchol
c
∣
∣
∣
18. USE CASE: SPARSITY
Application: signal recovery problems in which the original signal
is known to have a sparse representation
A test to compute Non-Sparse Coef cients (> 1e-4) & RMSE
OCTAVE ALS ECOS ADMM
Non-Sparse Coef cients: 16 20 18 18
RMSE: N.A. 9e-2 2e-2 2e-2
21. USE CASE: EQUALITY WITH BOUND
Application: Support Vector Machines based models with sparse
weight representation or portfolio optimization
A test to compute Sum, Non-Sparse Coef cients (> 1e-4) &
RMSE
OCTAVE ALS ECOS ADMM
Sum of Coef cients: 1 - 0.99 1
Non-Sparse Coef cients: 4 - 4 4
RMSE: N.A. 1.1 2e-4 5.5e-5
22. USE CASE: EQUALITY WITH BOUNDS
//Equalityconstraintx1+x2+...+xr=1
//Boundconstraintis0≤x≤1
valequalityBuilder=newCSCMatrix.Builder[Double](1,rank)
for(i<-untilrank)equalityBuilder.add(0,i,1)
valqpSolverEquality=
newQpSolver(rank,0,false,
Some(equalityBuilder.result),None,true,true)
qpSolverEquality.updateUb(Array.fill[Double](rank)(1.0))
qpSolverEquality.updateEquality(Array[Double](1.0))
23. USE CASE: EQUALITY WITH BOUNDS
valfactors=Array.range(0,numUsers).map{index=>
fillFullMatrix(userXtX(index),fullXtX)
valH=fullXtX.add(YtY.get.value)
valf=userXy(index).mul(-1)
valqpEqualityResult=qpSolverEquality.solve(H,f)
qpEqualityResult
}
30. FUTURE WORK
Release QpSolver-Spark after rigorous testing
Runtime optimizations for QpSolver-Spark integration
Release ECOS based LP and SOCP solvers based on
community feedback
Iterative Quadratic Minimizer with ADMM
Constrained Convex Minimizer with ADMM
32. JOIN US AND MAKE MACHINES SMARTER
References
ECOS by Domahidi et al. https://github.com/ifa-ethz/ecos
ADMM by Boyd et al.
http://web.stanford.edu/~boyd/papers/admm/
Proximal Algorithms by Neal Parikh and Professor Boyd