Coordinate descent optimization in Recommendation system
Xudong Sun, sun@aisbi.de
DSOR-AISBI
Xudong Sun,sun@aisbi.de (DSOR-AISBI)Coordinate descent optimization in Recommendation system 1 / 14
Outline
1 Introduction
2 Case Study
3 Reference
Introduction
Recap: the mathematical model behind recommendation systems
SVD with regularization
Maximum a posteriori (MAP) estimation
$$\min_{x_*, y_*} \sum_{u,i} \left(r_{ui} - x_u^T y_i\right)^2 + \lambda\left(\sum_u \|x_u\|^2 + \sum_i \|y_i\|^2\right)$$
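As a minimal sketch, the regularized objective on this slide could be evaluated as follows. The function name `regularized_mf_loss`, the dictionary layout of `ratings`, and the choice to sum only over observed (u, i) pairs are assumptions made for illustration; the slide does not say which pairs the sum runs over.

```python
def regularized_mf_loss(ratings, X, Y, lam):
    """Squared error over observed ratings plus an L2 penalty on all factors.

    ratings: dict mapping (u, i) -> r_ui (assumed: only observed pairs are listed)
    X: list of user factor vectors x_u, Y: list of item factor vectors y_i
    lam: regularization weight lambda
    """
    squared_error = sum(
        (r - sum(a * b for a, b in zip(X[u], Y[i]))) ** 2
        for (u, i), r in ratings.items()
    )
    penalty = sum(v * v for row in X for v in row) + sum(v * v for row in Y for v in row)
    return squared_error + lam * penalty
```

Setting `lam` to zero recovers the plain squared reconstruction error, which makes the effect of the penalty easy to inspect.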
Background
Drawbacks of the gradient descent algorithm
An appropriate learning rate is hard to choose
Convergence is slow: the optimum is only approached asymptotically
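A toy illustration of both drawbacks, assuming the one-dimensional objective f(x) = x^2 (chosen here for illustration, not taken from the slides): a small learning rate shrinks the iterate only slowly, while a learning rate that is too large makes it diverge.

```python
def gradient_descent(learning_rate, x0=1.0, n_steps=100):
    """Run plain gradient descent on f(x) = x**2, whose gradient is 2*x."""
    x = x0
    for _ in range(n_steps):
        x = x - learning_rate * 2.0 * x
    return x

slow = gradient_descent(0.01)     # each step multiplies x by 0.98
diverged = gradient_descent(1.1)  # each step multiplies x by -1.2, so |x| grows
```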
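Basic idea of coordinate descent
Optimize each coordinate (dimension) sequentially to decrease the objective, until no move along a single coordinate helps any more: $f(x^* + d\, e_i) \ge f(x^*)$ for every step $d$ and every coordinate direction $e_i$
Iterate until the result converges
Question: will it work?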
Preliminaries
What is a convex set? What properties does a convex set have?
Do you know how to compute derivatives in matrix algebra?
Is CD equivalent to SGD? Is $[f(x^* + d\, e_i) \ge f(x^*)] \equiv [f(x^*) = \min f]$?
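One way to probe the last question is with a small counterexample. For a convex and differentiable f the two conditions coincide, because a coordinate-wise minimum has zero gradient. Without differentiability they can differ. The sketch below uses f(x, y) = |x - y| + 0.1 (x^2 + y^2), a function chosen here for illustration and not taken from the slides: the point (1, 1) cannot be improved along either axis, yet the global minimum is at (0, 0).

```python
def f(x, y):
    """Convex but non-differentiable along the line x == y."""
    return abs(x - y) + 0.1 * (x * x + y * y)

def is_coordinatewise_minimum(x, y, steps):
    """Check f(x + d, y) >= f(x, y) and f(x, y + d) >= f(x, y) for all d in steps."""
    base = f(x, y)
    return all(f(x + d, y) >= base and f(x, y + d) >= base for d in steps)

steps = [k / 100.0 for k in range(-300, 301)]
stuck = is_coordinatewise_minimum(1.0, 1.0, steps)  # no single-axis move helps at (1, 1)
better = f(0.0, 0.0) < f(1.0, 1.0)                  # but (0, 0) has a lower value
```

Coordinate descent started at (1, 1) would therefore stop immediately even though the point is not optimal, which is why smoothness (or a separable non-smooth part, as in the Lasso) matters for convergence.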
How it works exactly
Suppose we want to compute the $k$-th iterate using the result of the $(k-1)$-th:
$$x_1^{(k)} := \arg\min_{x_1} f\left(x_1, x_2^{(k-1)}, x_3^{(k-1)}, \dots\right)$$
$$x_2^{(k)} := \arg\min_{x_2} f\left(x_1^{(k)}, x_2, x_3^{(k-1)}, \dots\right)$$
$$\vdots$$
$$x_n^{(k)} := \arg\min_{x_n} f\left(x_1^{(k)}, x_2^{(k)}, \dots, x_{n-1}^{(k)}, x_n\right)$$
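The cyclic scheme can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the author's implementation: the one-dimensional argmin is approximated by a golden-section search (a technique swapped in here), and the search window `radius` around the current value is a hypothetical parameter that presumes each one-dimensional slice is unimodal inside that window.

```python
import math

def golden_section_min(g, lo, hi, tol=1e-10):
    """Approximate the minimizer of a unimodal 1-D function g on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if g(c) < g(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2

def coordinate_descent(f, x0, n_sweeps=50, radius=10.0):
    """Cyclic coordinate descent: minimize f over one coordinate at a time."""
    x = list(x0)
    for _ in range(n_sweeps):
        for i in range(len(x)):
            def g(t, i=i):
                # f restricted to coordinate i, all other coordinates held fixed
                return f(x[:i] + [t] + x[i + 1:])
            x[i] = golden_section_min(g, x[i] - radius, x[i] + radius)
    return x
```

Note that coordinate i is updated in place, so later coordinates in the same sweep already see the new value, exactly as in the update rules above.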
Case Study
Algorithms
Linear regression
Lasso
SVM (SMO and DCD, with Python implementation)
Basic matrix factorization
WRMF
Factorization machine
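Preliminaries: matrix differentiation
Convention for the derivative of a vector with respect to a vector: keep the orientation of the denominator vector!
$$\frac{\partial y}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_2}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_1} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_1}{\partial x_n} & \frac{\partial y_2}{\partial x_n} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}, \qquad \frac{\partial y^T}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}$$
$\frac{\partial Ax}{\partial x} = A^T$, since $[Ax]_i = \sum_j a_{i,j} x_j$ and $\left[\frac{\partial Ax}{\partial x}\right]_{k,i} = \frac{\partial [Ax]_i}{\partial x_k} = a_{i,k}$
$\frac{\partial x^T A x}{\partial x} = Ax + A^T x$, since $x^T A x = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$ and $\frac{\partial x^T A x}{\partial x_k} = \frac{\partial \sum_{i,j} a_{ij} x_i x_j}{\partial x_k} = \sum_j a_{k,j} x_j + \sum_i a_{i,k} x_i$; $A$ must be square but not necessarily symmetric
$\frac{\partial x^T x}{\partial x} = \frac{\partial x^T I x}{\partial x} = 2x$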
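The quadratic-form identities can be sanity-checked numerically. The sketch below compares the closed forms against central finite differences on a random square, non-symmetric matrix; the helper `numerical_gradient` and the test sizes are illustrative choices, not part of the slides.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference approximation of the gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        grad[k] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))  # square, deliberately not symmetric
x = rng.normal(size=4)

# d(x^T A x)/dx should equal A x + A^T x
quad_analytic = A @ x + A.T @ x
quad_numeric = numerical_gradient(lambda v: v @ A @ v, x)

# d(x^T x)/dx should equal 2 x
norm_analytic = 2 * x
norm_numeric = numerical_gradient(lambda v: v @ v, x)
```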
Linear regression
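$$x^* = \arg\min_x f(x), \qquad f(x) = \frac{1}{2}\|y - Ax\|^2$$
$$0 = \frac{\partial f(x)}{\partial x_i} = A^T[i,:]\,(y - Ax)(-1) = A^T[i,:]\,\left(A[:,i]\,x_i + A[:,-i]\,x_{-i} - y\right)$$
Here $A[:,-i]$ means all columns of $A$ except column $i$, and $x_{-i}$ the corresponding entries of $x$. Solving for $x_i$:
$$x_i^* = \frac{A^T[i,:]\,\left(y - A[:,-i]\,x_{-i}\right)}{A^T[i,:]\,A[:,i]}$$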
Algorithm for CD in Linear regression
Question: why can't we directly solve for $x_i$ inside the bracket? (Recall your abstract algebra, data structures, or algorithms course.)
Input: design matrix $A$
Input: target variable $y$ for each training sample
Output: coefficient $x_i$ for each linear variable
while cycle < MaxCycle do
    for i = 1, ..., n do
        $x_i := \frac{A^T[i,:]\,(y - A[:,-i]\,x_{-i})}{A^T[i,:]\,A[:,i]}$
    end
    cycle := cycle + 1
end
Algorithm 1: CD for linear regression
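Algorithm 1 could be sketched with NumPy as below. This is a hedged illustration rather than the author's code: the function name, the zero initialization, the fixed number of cycles, and the running `residual` vector (kept so that y - A[:, -i] x_{-i} need not be recomputed from scratch) are all assumptions.

```python
import numpy as np

def cd_linear_regression(A, y, max_cycle=200):
    """Cyclic coordinate descent for min_x 0.5 * ||y - A x||^2."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = A.shape[1]
    x = np.zeros(n_features)
    residual = y - A @ x                 # kept equal to y - A x throughout
    col_sq_norm = np.sum(A * A, axis=0)  # A^T[i, :] A[:, i] for every column i
    for _ in range(max_cycle):
        for i in range(n_features):
            if col_sq_norm[i] == 0.0:
                continue                 # an all-zero column cannot change the fit
            # y - A[:, -i] x_{-i} equals the residual with column i's part added back
            partial = residual + A[:, i] * x[i]
            x[i] = A[:, i] @ partial / col_sq_norm[i]
            residual = partial - A[:, i] * x[i]
    return x
```

On a design matrix with full column rank the iterates approach the ordinary least-squares solution, so comparing against `np.linalg.lstsq` is a convenient check.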
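Simple coordinate descent for collaborative filtering
Model: user latent feature $U_i = [U_{i1}, U_{i2}, \dots, U_{in}]$, movie latent feature $V_j = [V_{j1}, V_{j2}, \dots, V_{jn}]$
Objective function:
$$\|R - U^T V\|^2 + \lambda\left(\|U\|^2 + \|V\|^2\right)$$
Split off the row of user $i$:
$$\|R[i,:] - U[:,i]^T V\|^2 + \|R[-i,:] - U[:,-i]^T V\|^2 + \lambda\left(\|U\|^2 + \|V\|^2\right)$$
Gradient with respect to $U_i$:
$$(-2V)\left(R[i,:] - U[:,i]^T V\right)^T + 2\lambda\, U[:,i] = 0$$
$$U[:,i] = \left(V V^T + \lambda I\right)^{-1} V\, R[i,:]^T$$
Split off the column of movie $j$:
$$\|R[:,j] - U^T V[:,j]\|^2 + \|R[:,-j] - U^T V[:,-j]\|^2 + \lambda\left(\|U\|^2 + \|V\|^2\right)$$
Gradient with respect to $V_j$:
$$(-2U)\left(R[:,j] - U^T V[:,j]\right) + 2\lambda\, V[:,j] = 0$$
$$U R[:,j] = \left(U U^T + \lambda I\right) V[:,j]$$
$$V[:,j] = \left(U U^T + \lambda I\right)^{-1} U\, R[:,j]$$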
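The two closed-form updates can be sketched with NumPy as one alternating sweep. The sketch assumes, as the formulas on this slide do, that R is fully observed and that user and movie factors are stored as the columns of U and V; the function names `als_sweep` and `objective` are illustrative.

```python
import numpy as np

def objective(R, U, V, lam):
    """||R - U^T V||^2 + lam * (||U||^2 + ||V||^2)."""
    return np.sum((R - U.T @ V) ** 2) + lam * (np.sum(U ** 2) + np.sum(V ** 2))

def als_sweep(R, U, V, lam):
    """One sweep: update every user column of U, then every movie column of V."""
    f = U.shape[0]
    ridge = lam * np.eye(f)
    # U[:, i] = (V V^T + lam I)^{-1} V R[i, :]^T, solved for all users i at once
    U = np.linalg.solve(V @ V.T + ridge, V @ R.T)
    # V[:, j] = (U U^T + lam I)^{-1} U R[:, j], solved for all movies j at once
    V = np.linalg.solve(U @ U.T + ridge, U @ R)
    return U, V
```

Because each half-step minimizes the objective exactly over one block of variables, the objective value should not increase from one sweep to the next, which gives a simple check when experimenting.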
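WRMF coordinate descent algorithm: ALS
Objective function:
$$\min_{x_*, y_*} \sum_{u,i} c_{ui}\left(p_{ui} - x_u^T y_i\right)^2 + \lambda\left(\sum_u \|x_u\|^2 + \sum_i \|y_i\|^2\right)$$
Alternating Least Squares solution (one iteration):
$$x_u = \left(Y^T C^u Y + \lambda I_{f \times f}\right)^{-1} Y^T C^u p(u)$$
$$y_i = \left(X^T C^i X + \lambda I_{f \times f}\right)^{-1} X^T C^i p(i)$$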
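A minimal sketch of the user-side update, under assumptions the slide does not spell out: Y stacks the item factors y_i as rows (items by f), `c_u` holds the diagonal of C^u (the confidences c_ui for user u), and `p_u` is the vector p(u) of that user's preferences p_ui. The function name is illustrative. The item-side update has the same form with the roles of X and Y exchanged.

```python
import numpy as np

def wrmf_user_update(Y, c_u, p_u, lam):
    """Solve (Y^T C^u Y + lam I) x_u = Y^T C^u p(u) for one user."""
    f = Y.shape[1]
    # Broadcasting scales column j of Y^T by c_u[j], i.e. forms Y^T C^u
    # without building the full diagonal matrix C^u.
    Yt_Cu = Y.T * c_u
    return np.linalg.solve(Yt_Cu @ Y + lam * np.eye(f), Yt_Cu @ p_u)
```

Using `np.linalg.solve` rather than an explicit inverse is a common numerical choice for this kind of regularized least-squares system.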
Reference