Distributed SVM
Harsha Vardhan
IIT Gandhinagar
harsha.tetali@iitgn.ac.in
April 30, 2017
Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 1 / 21
Overview
1 Alternating Direction Method of Multipliers
Objective
The Lagrangian Dual
Formulation of ADMM
2 Distributed SVM
ADMM
Objective
min_{x ∈ R^n, z ∈ R^m}  f(x) + g(z)    (1)
subject to  Ax + Bz = c
ADMM-Lagrangian
Lagrangian without the Penalty Term
L(x, z, λ) = f(x) + g(z) + λ^T (Ax + Bz − c)    (2)
ADMM-Lagrangian
Lagrangian with the Penalty Term
Lρ(x, z, λ) = f(x) + g(z) + λ^T (Ax + Bz − c) + (ρ/2) ||Ax + Bz − c||^2    (3)
ρ > 0 is called the augmented Lagrangian parameter. The Lagrangian with this added penalty term is also called the augmented Lagrangian.
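As a quick numerical illustration (my own sketch, not part of the original slides), the augmented Lagrangian of (3) can be evaluated directly; the helper name `augmented_lagrangian` and the particular choices of f, g, A, B, c below are hypothetical examples:

```python
import numpy as np

def augmented_lagrangian(f, g, A, B, c, x, z, lam, rho):
    """Evaluate L_rho(x, z, lam) = f(x) + g(z) + lam^T(Ax + Bz - c)
    + (rho/2) * ||Ax + Bz - c||^2, as in Equation (3)."""
    r = A @ x + B @ z - c          # primal residual Ax + Bz - c
    return f(x) + g(z) + lam @ r + 0.5 * rho * np.dot(r, r)

# Toy choices: quadratic f, l1-norm g, and the constraint x - z = 0.
A = np.eye(2); B = -np.eye(2); c = np.zeros(2)
f = lambda x: 0.5 * np.dot(x, x)
g = lambda z: np.sum(np.abs(z))
x = np.array([1.0, -2.0]); z = x.copy(); lam = np.array([0.3, 0.7])
# At a feasible point (residual = 0) the multiplier and penalty terms
# vanish, so L_rho reduces to f(x) + g(z) = 2.5 + 3.0 = 5.5.
val = augmented_lagrangian(f, g, A, B, c, x, z, lam, rho=1.0)
```

At infeasible points the penalty term grows quadratically with the constraint violation, which is what makes the augmented Lagrangian better behaved than the plain one.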
ADMM
Formulation
We first define the optimal value of the primal problem:
p* = inf { f(x) + g(z) : Ax + Bz = c }    (4)
The dual function (reusing the symbol g, now as a function of λ) is:
g(λ) = inf_{x,z} [ f(x) + g(z) + λ^T (Ax + Bz − c) + (ρ/2) ||Ax + Bz − c||^2 ]    (5)
ADMM
Formulation
Assuming that the saddle point of Lρ(x, z, λ) exists and that we have
strong duality, we can write:
p* = d* = sup_λ inf_{x,z} [ f(x) + g(z) + λ^T (Ax + Bz − c) + (ρ/2) ||Ax + Bz − c||^2 ]    (6)
ADMM
Formulation
Writing down the complete optimization problem formulated so far, we have:
p* = d* = sup_λ inf_{x,z} [ f(x) + g(z) + λ^T (Ax + Bz − c) + (ρ/2) ||Ax + Bz − c||^2 ]    (7)
ADMM
Formulation
The problem can be restated as:
sup_λ inf_z inf_x [ f(x) + g(z) + λ^T (Ax + Bz − c) + (ρ/2) ||Ax + Bz − c||^2 ]    (8)
ADMM
Formulation
We first solve the innermost problem, the minimization over x:
sup_λ inf_z inf_x [ f(x) + g(z) + λ^T (Ax + Bz − c) + (ρ/2) ||Ax + Bz − c||^2 ]    (9)
To solve this, we follow the update rule:
x^{(k+1)} := arg min_x Lρ(x, z^{(k)}, λ^{(k)})    (10)
ADMM
Formulation
We follow the Gauss–Seidel approach to update the remaining variables,
i.e. we use the updated values of the variables already updated. Now for
the minimization over z:
sup_λ inf_z inf_x [ f(x) + g(z) + λ^T (Ax + Bz − c) + (ρ/2) ||Ax + Bz − c||^2 ]    (11)
We use:
z^{(k+1)} := arg min_z Lρ(x^{(k+1)}, z, λ^{(k)})    (12)
ADMM
Formulation
Now we need an update for the variable λ; for this we solve the outermost
maximization problem. Taking the gradient of g with respect to λ, we get
∇g(λ) = Ax + Bz − c    (13)
We now move in the direction of ascent to increase the value of the function
g(·), which gives the following update rule:
λ^{(k+1)} := λ^{(k)} + ρ (Ax^{(k+1)} + Bz^{(k+1)} − c)    (14)
Here the step size associated with the gradient is set equal to the
augmented Lagrangian parameter ρ.
ADMM
Formulation
Thus, the final formulation of the ADMM algorithm is:
x^{(k+1)} := arg min_x Lρ(x, z^{(k)}, λ^{(k)})    (15)
z^{(k+1)} := arg min_z Lρ(x^{(k+1)}, z, λ^{(k)})    (16)
λ^{(k+1)} := λ^{(k)} + ρ (Ax^{(k+1)} + Bz^{(k+1)} − c)    (17)
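The three update rules above can be exercised end-to-end on a tiny consensus problem where every step has a closed form. The choice of f and g (two quadratics) with A = I, B = −I, c = 0, and the step count, are my own toy choices, not from the slides:

```python
import numpy as np

# Toy consensus problem: f(x) = 0.5*||x - a||^2, g(z) = 0.5*||z - b||^2,
# subject to x - z = 0 (i.e. A = I, B = -I, c = 0).
# The optimum is x = z = (a + b) / 2.
a = np.array([1.0, 3.0])
b = np.array([5.0, -1.0])
rho = 1.0

x = np.zeros(2); z = np.zeros(2); lam = np.zeros(2)
for _ in range(200):
    # x-update: closed-form argmin of 0.5||x-a||^2 + lam^T(x-z) + (rho/2)||x-z||^2
    x = (a - lam + rho * z) / (1.0 + rho)
    # z-update: closed-form argmin of 0.5||z-b||^2 - lam^T z + (rho/2)||x-z||^2
    z = (b + lam + rho * x) / (1.0 + rho)
    # dual ascent on lambda with step size rho
    lam = lam + rho * (x - z)

# x and z converge to the consensus solution (a + b)/2 = [3.0, 1.0]
```

For this problem the iteration error roughly halves per step, so x and z agree to machine precision well within 200 iterations; in general the x- and z-subproblems need their own solvers rather than closed forms.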
Support Vector Machines
In Support Vector Machines, the main aim is to find a hyperplane, with normal
vector w, that linearly separates the data in the space where the data resides.
This can be posed as the following optimization problem.
Support Vector Machine
Given a dataset {(x_i, y_i)}_{i=1}^{l} (x_i ∈ R^n, y_i ∈ {−1, +1}), the
L2-regularized L2-loss (squared hinge loss) SVM is:
min_w (1/2) ||w||_2^2 + C Σ_{i=1}^{l} max(1 − y_i w^T x_i, 0)^2    (18)
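As a sanity check, the objective in (18) can be evaluated directly with NumPy. This is my own toy example, not from the slides; `svm_objective` is a hypothetical helper name, and the squared hinge follows the L2-loss form stated above:

```python
import numpy as np

def svm_objective(w, X, y, C):
    """L2-regularized squared-hinge SVM objective, as in (18):
    0.5*||w||_2^2 + C * sum_i max(1 - y_i * w^T x_i, 0)^2."""
    margins = np.maximum(1.0 - y * (X @ w), 0.0)   # hinge terms, one per sample
    return 0.5 * np.dot(w, w) + C * np.sum(margins ** 2)

# Tiny check: two points classified with functional margin >= 1 incur
# zero loss, so the objective equals the regularizer 0.5*||w||^2 = 0.5.
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])
w = np.array([1.0, 0.0])
val = svm_objective(w, X, y, C=1.0)
```

The squared hinge is differentiable everywhere, which is what later makes gradient-based local solvers convenient for the distributed w-updates.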
Distributed Support Vector Machines
The task of linear classification is now distributed among several machines:
each machine holds a different part of the dataset, each of manageable size.
To make the problem amenable to decomposition, we first let
{B_1, B_2, ..., B_m} be a partition of the data indices {1, 2, ..., l}.
SVM in the Distributed Setting
min_w (1/2) ||w||_2^2 + C Σ_{j=1}^{m} Σ_{i∈B_j} max(1 − y_i w^T x_i, 0)^2    (19)
Distributed Support Vector Machine
Suppose each machine, working with its own dataset, finds an optimal w in its
iteration. Then there is no single w but a set of weight vectors, one per
machine; we denote them by w_j for j = 1, 2, ..., m. Since we want the global
weight vector to be a single vector rather than many, we impose a new
artificial consensus condition:
z = w_1 = w_2 = · · · = w_m
SVM in the Distributed Setting
In this setting the distributed SVM takes the following form:
min_{w_1,...,w_m, z} (1/2) ||z||_2^2 + C Σ_{j=1}^{m} Σ_{i∈B_j} max(1 − y_i w_j^T x_i, 0)^2    (20)
subject to  w_j − z = 0,  j = 1, ..., m
Distributed Support Vector Machine
Now let us write the Augmented Lagrangian.
Augmented Lagrangian
L(w, z, λ) = (1/2) ||z||_2^2 + C Σ_{j=1}^{m} Σ_{i∈B_j} max(1 − y_i w_j^T x_i, 0)^2 + Σ_{j=1}^{m} [ (ρ/2) ||w_j − z||_2^2 + λ_j^T (w_j − z) ]    (21)
where w := {w_1, w_2, ..., w_m} and λ := {λ_1, λ_2, ..., λ_m}.
The ADMM Way
Now we use ADMM to optimize the above Lagrangian. As seen in the
ADMM section, we have the following update rules.
ADMM on Distributed SVM
w^{(k+1)} = arg min_w L(w, z^{(k)}, λ^{(k)})    (22)
z^{(k+1)} = arg min_z L(w^{(k+1)}, z, λ^{(k)})    (23)
λ_j^{(k+1)} = λ_j^{(k)} + ρ (w_j^{(k+1)} − z^{(k+1)}),  j = 1, ..., m    (24)
The Update Equations
The problem in (22) can be parallelized across m machines, with each machine
solving the following minimization problem:
w Update
w_j^{(k+1)} = arg min_w [ C Σ_{i∈B_j} max(1 − y_i w^T x_i, 0)^2 + (ρ/2) ||w − z^{(k)}||_2^2 + λ_j^{(k)T} (w − z^{(k)}) ],  j = 1, ..., m    (25)
z Update
z^{(k+1)} = ( Σ_{j=1}^{m} ( ρ w_j^{(k+1)} + λ_j^{(k)} ) ) / (mρ + 1)    (26)
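The update equations above can be sketched end-to-end. This is a minimal illustration, not the implementation from the references: the w-update (25) has no closed form, so it is approximated here by a few gradient-descent steps on the differentiable squared-hinge local objective; the data split, step sizes, and iteration counts are arbitrary toy choices of mine:

```python
import numpy as np

# Toy linearly separable data split across m machines (blocks B_j).
m, C, rho = 2, 1.0, 1.0
X_blocks = [np.array([[2.0, 1.0], [1.5, 2.0]]),
            np.array([[-2.0, -1.0], [-1.0, -2.0]])]
y_blocks = [np.array([1.0, 1.0]), np.array([-1.0, -1.0])]

def local_w_update(X, y, z, lam, n_steps=50, lr=0.05):
    """Approximate the w-update (25) by gradient descent on
    C*sum_i max(1 - y_i w^T x_i, 0)^2 + (rho/2)||w - z||^2 + lam^T (w - z)."""
    w = z.copy()
    for _ in range(n_steps):
        s = np.maximum(1.0 - y * (X @ w), 0.0)       # active hinge terms
        grad = -2.0 * C * (X.T @ (s * y)) + rho * (w - z) + lam
        w -= lr * grad
    return w

z = np.zeros(2)
lams = [np.zeros(2) for _ in range(m)]
for _ in range(50):
    # (25): each machine updates its local w_j (in parallel in practice)
    ws = [local_w_update(X_blocks[j], y_blocks[j], z, lams[j]) for j in range(m)]
    # (26): closed-form averaging step for the consensus variable z
    z = sum(rho * ws[j] + lams[j] for j in range(m)) / (m * rho + 1.0)
    # (24): dual update, one multiplier per machine
    for j in range(m):
        lams[j] = lams[j] + rho * (ws[j] - z)
```

After the loop, z separates the toy data (y_i z^T x_i > 0 for all i). Zhang, Lee and Shin use dedicated local solvers (e.g. trust-region or dual coordinate descent) in place of the plain gradient steps here.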
References
Boyd, S., Parikh, N., Chu, E., Peleato, B. and Eckstein, J., 2011. Distributed
optimization and statistical learning via the alternating direction method of
multipliers. Foundations and Trends in Machine Learning, 3(1), pp.1-122.
Zhang, C., Lee, H. and Shin, K.G., 2012. Efficient Distributed Linear Classification
Algorithms via the Alternating Direction Method of Multipliers. In AISTATS (pp.
1398-1406).
The End
Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 21 / 21

More Related Content

What's hot

A non-stiff boundary integral method for internal waves
A non-stiff boundary integral method for internal wavesA non-stiff boundary integral method for internal waves
A non-stiff boundary integral method for internal waves
Alex (Oleksiy) Varfolomiyev
 
Green Theorem
Green TheoremGreen Theorem
Green Theorem
Sarwan Ursani
 
Daa unit 4
Daa unit 4Daa unit 4
Daa unit 4
Abhimanyu Mishra
 
A Note on Correlated Topic Models
A Note on Correlated Topic ModelsA Note on Correlated Topic Models
A Note on Correlated Topic ModelsTomonari Masada
 
cheb_conf_aksenov.pdf
cheb_conf_aksenov.pdfcheb_conf_aksenov.pdf
cheb_conf_aksenov.pdf
Alexey Vasyukov
 
Relaxation method
Relaxation methodRelaxation method
Relaxation method
Parinda Rajapaksha
 
Dda algorithm
Dda algorithmDda algorithm
Dda algorithm
Mani Kanth
 
Least Square Optimization and Sparse-Linear Solver
Least Square Optimization and Sparse-Linear SolverLeast Square Optimization and Sparse-Linear Solver
Least Square Optimization and Sparse-Linear Solver
Ji-yong Kwon
 
Line drawing algo.
Line drawing algo.Line drawing algo.
Line drawing algo.Mohd Arif
 
Treewidth and Applications
Treewidth and ApplicationsTreewidth and Applications
Treewidth and ApplicationsASPAK2014
 
Longest Common Subsequence & Matrix Chain Multiplication
Longest Common Subsequence & Matrix Chain MultiplicationLongest Common Subsequence & Matrix Chain Multiplication
Longest Common Subsequence & Matrix Chain Multiplication
JaneAlamAdnan
 
Vector calculus
Vector calculusVector calculus
Vector calculus
sujathavvv
 
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
Kobkrit Viriyayudhakorn
 
Matrix chain multiplication by MHM
Matrix chain multiplication by MHMMatrix chain multiplication by MHM
Matrix chain multiplication by MHM
Md Mosharof Hosen
 
Matrix Product (Modulo-2) Of Cycle Graphs
Matrix Product (Modulo-2) Of Cycle GraphsMatrix Product (Modulo-2) Of Cycle Graphs
Matrix Product (Modulo-2) Of Cycle Graphs
inventionjournals
 
Double Integral
Double IntegralDouble Integral
Double Integral
Keerthana Nambiar
 
IIR filter realization using direct form I & II
IIR filter realization using direct form I & IIIIR filter realization using direct form I & II
IIR filter realization using direct form I & II
Sarang Joshi
 
Image Restoration 2 (Digital Image Processing)
Image Restoration 2 (Digital Image Processing)Image Restoration 2 (Digital Image Processing)
Image Restoration 2 (Digital Image Processing)
VARUN KUMAR
 
CLIM Fall 2017 Course: Statistics for Climate Research, Estimating Curves and...
CLIM Fall 2017 Course: Statistics for Climate Research, Estimating Curves and...CLIM Fall 2017 Course: Statistics for Climate Research, Estimating Curves and...
CLIM Fall 2017 Course: Statistics for Climate Research, Estimating Curves and...
The Statistical and Applied Mathematical Sciences Institute
 

What's hot (19)

A non-stiff boundary integral method for internal waves
A non-stiff boundary integral method for internal wavesA non-stiff boundary integral method for internal waves
A non-stiff boundary integral method for internal waves
 
Green Theorem
Green TheoremGreen Theorem
Green Theorem
 
Daa unit 4
Daa unit 4Daa unit 4
Daa unit 4
 
A Note on Correlated Topic Models
A Note on Correlated Topic ModelsA Note on Correlated Topic Models
A Note on Correlated Topic Models
 
cheb_conf_aksenov.pdf
cheb_conf_aksenov.pdfcheb_conf_aksenov.pdf
cheb_conf_aksenov.pdf
 
Relaxation method
Relaxation methodRelaxation method
Relaxation method
 
Dda algorithm
Dda algorithmDda algorithm
Dda algorithm
 
Least Square Optimization and Sparse-Linear Solver
Least Square Optimization and Sparse-Linear SolverLeast Square Optimization and Sparse-Linear Solver
Least Square Optimization and Sparse-Linear Solver
 
Line drawing algo.
Line drawing algo.Line drawing algo.
Line drawing algo.
 
Treewidth and Applications
Treewidth and ApplicationsTreewidth and Applications
Treewidth and Applications
 
Longest Common Subsequence & Matrix Chain Multiplication
Longest Common Subsequence & Matrix Chain MultiplicationLongest Common Subsequence & Matrix Chain Multiplication
Longest Common Subsequence & Matrix Chain Multiplication
 
Vector calculus
Vector calculusVector calculus
Vector calculus
 
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
[Lecture 3] AI and Deep Learning: Logistic Regression (Coding)
 
Matrix chain multiplication by MHM
Matrix chain multiplication by MHMMatrix chain multiplication by MHM
Matrix chain multiplication by MHM
 
Matrix Product (Modulo-2) Of Cycle Graphs
Matrix Product (Modulo-2) Of Cycle GraphsMatrix Product (Modulo-2) Of Cycle Graphs
Matrix Product (Modulo-2) Of Cycle Graphs
 
Double Integral
Double IntegralDouble Integral
Double Integral
 
IIR filter realization using direct form I & II
IIR filter realization using direct form I & IIIIR filter realization using direct form I & II
IIR filter realization using direct form I & II
 
Image Restoration 2 (Digital Image Processing)
Image Restoration 2 (Digital Image Processing)Image Restoration 2 (Digital Image Processing)
Image Restoration 2 (Digital Image Processing)
 
CLIM Fall 2017 Course: Statistics for Climate Research, Estimating Curves and...
CLIM Fall 2017 Course: Statistics for Climate Research, Estimating Curves and...CLIM Fall 2017 Course: Statistics for Climate Research, Estimating Curves and...
CLIM Fall 2017 Course: Statistics for Climate Research, Estimating Curves and...
 

Similar to Distributed Support Vector Machines

Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
Tomoya Murata
 
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
The Statistical and Applied Mathematical Sciences Institute
 
02 basics i-handout
02 basics i-handout02 basics i-handout
02 basics i-handout
sheetslibrary
 
Linear Machine Learning Models with L2 Regularization and Kernel Tricks
Linear Machine Learning Models with L2 Regularization and Kernel TricksLinear Machine Learning Models with L2 Regularization and Kernel Tricks
Linear Machine Learning Models with L2 Regularization and Kernel Tricks
Fengtao Wu
 
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
IJERA Editor
 
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
IJERA Editor
 
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
The Statistical and Applied Mathematical Sciences Institute
 
On the solvability of a system of forward-backward linear equations with unbo...
On the solvability of a system of forward-backward linear equations with unbo...On the solvability of a system of forward-backward linear equations with unbo...
On the solvability of a system of forward-backward linear equations with unbo...
Nikita V. Artamonov
 
2013 IEEE International Symposium on Information Theory
2013 IEEE International Symposium on Information Theory2013 IEEE International Symposium on Information Theory
2013 IEEE International Symposium on Information Theory
Joe Suzuki
 
Mixed ISS systems
Mixed ISS systemsMixed ISS systems
Mixed ISS systems
MKosmykov
 
Numerical analysis m2 l4slides
Numerical analysis  m2 l4slidesNumerical analysis  m2 l4slides
Numerical analysis m2 l4slides
SHAMJITH KM
 
Bayesian Criteria based on Universal Measures
Bayesian Criteria based on Universal MeasuresBayesian Criteria based on Universal Measures
Bayesian Criteria based on Universal Measures
Joe Suzuki
 
Banco de preguntas para el ap
Banco de preguntas para el apBanco de preguntas para el ap
Banco de preguntas para el ap
MARCELOCHAVEZ23
 
Parallel Bayesian Optimization
Parallel Bayesian OptimizationParallel Bayesian Optimization
Parallel Bayesian Optimization
Sri Ambati
 
PART I.3 - Physical Mathematics
PART I.3 - Physical MathematicsPART I.3 - Physical Mathematics
PART I.3 - Physical Mathematics
Maurice R. TREMBLAY
 
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
Yuko Kuroki (黒木祐子)
 
A Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cubeA Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cube
VjekoslavKovac1
 
Hybrid dynamics in large-scale logistics networks
Hybrid dynamics in large-scale logistics networksHybrid dynamics in large-scale logistics networks
Hybrid dynamics in large-scale logistics networks
MKosmykov
 

Similar to Distributed Support Vector Machines (20)

Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
 
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
 
02 basics i-handout
02 basics i-handout02 basics i-handout
02 basics i-handout
 
A
AA
A
 
Linear Machine Learning Models with L2 Regularization and Kernel Tricks
Linear Machine Learning Models with L2 Regularization and Kernel TricksLinear Machine Learning Models with L2 Regularization and Kernel Tricks
Linear Machine Learning Models with L2 Regularization and Kernel Tricks
 
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
 
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
 
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
 
On the solvability of a system of forward-backward linear equations with unbo...
On the solvability of a system of forward-backward linear equations with unbo...On the solvability of a system of forward-backward linear equations with unbo...
On the solvability of a system of forward-backward linear equations with unbo...
 
SASA 2016
SASA 2016SASA 2016
SASA 2016
 
2013 IEEE International Symposium on Information Theory
2013 IEEE International Symposium on Information Theory2013 IEEE International Symposium on Information Theory
2013 IEEE International Symposium on Information Theory
 
Mixed ISS systems
Mixed ISS systemsMixed ISS systems
Mixed ISS systems
 
Numerical analysis m2 l4slides
Numerical analysis  m2 l4slidesNumerical analysis  m2 l4slides
Numerical analysis m2 l4slides
 
Bayesian Criteria based on Universal Measures
Bayesian Criteria based on Universal MeasuresBayesian Criteria based on Universal Measures
Bayesian Criteria based on Universal Measures
 
Banco de preguntas para el ap
Banco de preguntas para el apBanco de preguntas para el ap
Banco de preguntas para el ap
 
Parallel Bayesian Optimization
Parallel Bayesian OptimizationParallel Bayesian Optimization
Parallel Bayesian Optimization
 
PART I.3 - Physical Mathematics
PART I.3 - Physical MathematicsPART I.3 - Physical Mathematics
PART I.3 - Physical Mathematics
 
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
 
A Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cubeA Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cube
 
Hybrid dynamics in large-scale logistics networks
Hybrid dynamics in large-scale logistics networksHybrid dynamics in large-scale logistics networks
Hybrid dynamics in large-scale logistics networks
 

Recently uploaded

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
Group Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana BuscigliopptxGroup Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana Buscigliopptx
ArianaBusciglio
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
vaibhavrinwa19
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 
Normal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of LabourNormal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of Labour
Wasim Ak
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
SACHIN R KONDAGURI
 
Digital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion DesignsDigital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion Designs
chanes7
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
Vikramjit Singh
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
Tamralipta Mahavidyalaya
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
thanhdowork
 

Recently uploaded (20)

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
Group Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana BuscigliopptxGroup Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana Buscigliopptx
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
Normal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of LabourNormal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of Labour
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
 
Digital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion DesignsDigital Artifact 2 - Investigating Pavilion Designs
Digital Artifact 2 - Investigating Pavilion Designs
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 

Distributed Support Vector Machines

  • 1. Distributed SVM Harsha Vardhan IIT Gandhinagar harsha.tetali@iitgn.ac.in April 30, 2017 Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 1 / 21
  • 2. Overview 1 Alternating Direction Method of Multipliers Objective The Lagrangian Dual Formulation of ADMM 2 Distributed SVM Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 2 / 21
  • 3. ADMM Objective min x∈Rn,z∈Rm f (x) + g(z) (1) subject to Ax + Bz = c Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 3 / 21
  • 4. ADMM-Lagrangian Lagrangian without the Penalty Term Lρ(x, z, λ) = f (x) + g(z) + λ (Ax + Bz − c) (2) Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 4 / 21
  • 5. ADMM-Lagrangian Lagrangian with the Penalty Term Lρ(x, z, λ) = f (x) + g(z) + λ (Ax + Bz − c) + ρ 2 Ax + Bz − c 2 (3) ρ > 0 is called the Augmented Lagrangian Parameter. This Lagrangian with added penalty term is also called the Augmented Lagrangian. Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 5 / 21
  • 6. ADMM Formulation We need the following: p∗ = Inf{f (x) + g(z)|Ax + Bz = c} (4) We have the dual problem formulated as: g(λ) = inf x,z f (x) + g(z) + λ (Ax + Bz − c) + ρ 2 Ax + Bz − c 2 (5) Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 6 / 21
  • 7. ADMM Formulation Assuming that the saddle point of Lρ(x, z, λ) exists and that we have strong duality, we can write: p∗ = d∗ = sup λ inf x,z f (x) + g(z) + λ (Ax + Bz − c) + ρ 2 Ax + Bz − c 2 (6) Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 7 / 21
  • 8. ADMM Formulation Writing down the complete optimization problem, formulated till now, we have, p∗ = d∗ = sup λ inf x,z f (x) + g(z) + λ (Ax + Bz − c) + ρ 2 Ax + Bz − c 2 (7) Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 8 / 21
  • 9. ADMM Formulation The problem can be restated as, sup λ inf z inf x f (x) + g(z) + λ (Ax + Bz − c) + ρ 2 Ax + Bz − c 2 (8) Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 9 / 21
  • 10. ADMM Formulation We try to solve the underlined problem first: sup λ inf z inf x f (x) + g(z) + λ (Ax + Bz − c) + ρ 2 Ax + Bz − c 2 (9) To solve this, we follow the rule: x(k+1) := arg min x L(x, z(k) , λ(k) ) (10) Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 10 / 21
  • 11. ADMM Formulation We follow the Gauss Seidel Approach to update the remaining variables, i.e. we use the updated values of the variables already updated. Now for the problem underlined below: sup λ inf z inf x f (x) + g(z) + λ (Ax + Bz − c) + ρ 2 Ax + Bz − c 2 (11) We use: z(k+1) := arg min z L(x(k+1) , z, λ(k) ) (12) Harsha Vardhan (IIT Gandhinagar) ADMM April 30, 2017 11 / 21
  • 12. ADMM Formulation. Now we need an update for the variable $\lambda$; for this we solve the outermost maximization problem. Differentiating the dual function $g$ with respect to $\lambda$ gives:
    $\nabla g(\lambda) = Ax + Bz - c$ (13)
    We move in the ascent direction to increase the value of $g(\cdot)$, which yields the update rule:
    $\lambda^{(k+1)} := \lambda^{(k)} + \rho\,(Ax^{(k+1)} + Bz^{(k+1)} - c)$ (14)
    Here the step size of the gradient ascent is set equal to the augmented Lagrangian parameter $\rho$.
  • 13. ADMM Formulation. Thus, the final formulation of the ADMM algorithm is:
    $x^{(k+1)} := \arg\min_x L_\rho(x, z^{(k)}, \lambda^{(k)})$ (15)
    $z^{(k+1)} := \arg\min_z L_\rho(x^{(k+1)}, z, \lambda^{(k)})$ (16)
    $\lambda^{(k+1)} := \lambda^{(k)} + \rho\,(Ax^{(k+1)} + Bz^{(k+1)} - c)$ (17)
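  As a concrete illustration of the three update rules above, here is a minimal Python sketch of ADMM on a toy consensus problem, $\min\, \frac{1}{2}(x-a)^2 + \frac{1}{2}(z-b)^2$ subject to $x - z = 0$ (so $A = 1$, $B = -1$, $c = 0$), where both subproblems have closed-form solutions. The problem, function name, and parameter values are illustrative choices, not part of the slides.

  ```python
  # ADMM on a toy problem: min 0.5*(x-a)^2 + 0.5*(z-b)^2  s.t.  x - z = 0.
  # With A = 1, B = -1, c = 0, the three updates reduce to scalar formulas.

  def admm_toy(a, b, rho=1.0, iters=200):
      x = z = lam = 0.0
      for _ in range(iters):
          # x-update: argmin_x 0.5*(x-a)^2 + lam*(x-z) + (rho/2)*(x-z)^2
          x = (a - lam + rho * z) / (1.0 + rho)
          # z-update (Gauss-Seidel: uses the fresh x just computed)
          z = (b + lam + rho * x) / (1.0 + rho)
          # dual ascent on lambda with step size rho
          lam = lam + rho * (x - z)
      return x, z, lam

  x, z, lam = admm_toy(a=1.0, b=3.0)
  # x and z agree and equal the consensus optimum (a + b) / 2 = 2.0
  ```

  Both primal updates here are obtained by setting the derivative of the augmented Lagrangian to zero, which is exactly what (15) and (16) prescribe when the subproblems are smooth.
  
  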
  • 14. Support Vector Machines. In support vector machines, the aim is to find a hyperplane $w$ that linearly separates the data in the space where it resides. This can be posed as the following optimization problem.
    Support Vector Machine: given a dataset $\{(x_i, y_i)\}_{i=1}^{l}$, $x_i \in \mathbb{R}^n$, $y_i \in \{-1, +1\}$, the L2-regularized L2-loss (squared hinge loss) SVM is:
    $\min_w\; \frac{1}{2}\|w\|_2^2 + C \sum_{i=1}^{l} \max(1 - y_i w^\top x_i,\, 0)^2$ (18)
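  The objective in (18) is straightforward to evaluate numerically. The small helper below is an illustrative sketch (the function name and example data are not from the slides) that computes the L2-regularized squared-hinge objective for a weight vector over a dataset:

  ```python
  import numpy as np

  def svm_objective(w, X, y, C):
      """L2-regularized squared-hinge SVM objective:
      0.5 * ||w||^2 + C * sum_i max(1 - y_i * w^T x_i, 0)^2."""
      margins = np.maximum(1.0 - y * (X @ w), 0.0)
      return 0.5 * float(w @ w) + C * float(np.sum(margins ** 2))

  # At w = 0 every margin term is exactly 1, so the objective equals C * l.
  ```

  This is the quantity that the distributed formulation on the next slides decomposes across machines.
  
  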
  • 15. Distributed Support Vector Machines. The task of linear classification is now distributed among several machines, each holding a different dataset of manageable size. To make the problem amenable to decomposition, we first let $\{B_1, B_2, \ldots, B_m\}$ be a partition of the data indices $\{1, 2, \ldots, l\}$.
    SVM in the distributed setting:
    $\min_w\; \frac{1}{2}\|w\|_2^2 + C \sum_{j=1}^{m} \sum_{i \in B_j} \max(1 - y_i w^\top x_i,\, 0)^2$ (19)
  • 16. Distributed Support Vector Machine. Suppose each machine, working on its own data block, finds an optimal weight vector for its subproblem. There is then no single $w$ but a set of weight vectors, one per machine; we denote them $w_j$ for $j = 1, 2, \ldots, m$. Since we want a single global vector, we impose a new artificial consistency constraint: $z = w_1 = w_2 = \cdots = w_m$.
    SVM in the distributed setting then takes the form:
    $\min_{w_1, \ldots, w_m, z}\; \frac{1}{2}\|z\|_2^2 + C \sum_{j=1}^{m} \sum_{i \in B_j} \max(1 - y_i w_j^\top x_i,\, 0)^2$ (20)
    subject to $w_j - z = 0,\; j = 1, \ldots, m$
  • 17. Distributed Support Vector Machine. Now let us write the augmented Lagrangian:
    $L(w, z, \lambda) = \frac{1}{2}\|z\|_2^2 + C \sum_{j=1}^{m} \sum_{i \in B_j} \max(1 - y_i w_j^\top x_i,\, 0)^2 + \sum_{j=1}^{m} \left[ \frac{\rho}{2}\|w_j - z\|_2^2 + \lambda_j^\top (w_j - z) \right]$ (21)
    where $w := \{w_1, w_2, \ldots, w_m\}$ and $\lambda := \{\lambda_1, \lambda_2, \ldots, \lambda_m\}$.
  • 18. The ADMM Way. Now we use ADMM to optimize the above Lagrangian. As derived in the ADMM section, the update rules are:
    $w^{(k+1)} = \arg\min_w L(w, z^{(k)}, \lambda^{(k)})$ (22)
    $z^{(k+1)} = \arg\min_z L(w^{(k+1)}, z, \lambda^{(k)})$ (23)
    $\lambda_j^{(k+1)} = \lambda_j^{(k)} + \rho\,(w_j^{(k+1)} - z^{(k+1)}),\quad j = 1, \ldots, m$ (24)
  • 19. The Update Equations. The problem in (22) splits across the $m$ machines, with machine $j$ solving the following minimization problem:
    $w_j^{(k+1)} = \arg\min_w\; C \sum_{i \in B_j} \max(1 - y_i w^\top x_i,\, 0)^2 + \frac{\rho}{2}\|w - z^{(k)}\|_2^2 + \lambda_j^{(k)\top}(w - z^{(k)}),\quad j = 1, \ldots, m$ (25)
    The $z$ subproblem has the closed-form solution:
    $z^{(k+1)} = \frac{\sum_{j=1}^{m} \left( \rho\, w_j^{(k+1)} + \lambda_j^{(k)} \right)}{m\rho + 1}$ (26)
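  Putting the $w$, $z$, and $\lambda$ updates together, the following Python sketch runs consensus ADMM for the distributed squared-hinge SVM on a tiny hand-made dataset. The data, the block partition, the step sizes, and the inner gradient-descent solver for the $w_j$ subproblem are all illustrative assumptions; in practice each block would live on a separate machine and the $w_j$ update would use a dedicated solver.

  ```python
  import numpy as np

  def dsvm_admm(X, y, blocks, C=1.0, rho=1.0, outer=50, inner=50, lr=0.01):
      """Consensus ADMM for distributed L2-loss SVM, following updates (25)-(26)."""
      n = X.shape[1]
      m = len(blocks)
      W = [np.zeros(n) for _ in range(m)]   # per-machine weight vectors w_j
      L = [np.zeros(n) for _ in range(m)]   # per-machine multipliers lambda_j
      z = np.zeros(n)                       # global consensus vector
      for _ in range(outer):
          # w-update: each "machine" minimizes its own augmented subproblem,
          # here approximately, by plain gradient descent (loss is smooth).
          for j, B in enumerate(blocks):
              Xj, yj, w = X[B], y[B], W[j].copy()
              for _ in range(inner):
                  margins = np.maximum(1.0 - yj * (Xj @ w), 0.0)
                  grad = (-2.0 * C * (margins * yj) @ Xj
                          + rho * (w - z) + L[j])
                  w -= lr * grad
              W[j] = w
          # z-update: closed form from (26)
          z = sum(rho * W[j] + L[j] for j in range(m)) / (m * rho + 1.0)
          # dual update from (24)
          for j in range(m):
              L[j] = L[j] + rho * (W[j] - z)
      return z

  X = np.array([[2.0, 2.0], [3.0, 1.0], [2.0, 3.0],
                [-2.0, -2.0], [-3.0, -1.0], [-2.0, -3.0]])
  y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
  blocks = [[0, 2, 4], [1, 3, 5]]           # two "machines", mixed labels each
  z = dsvm_admm(X, y, blocks)
  # sign(X @ z) recovers the labels on this linearly separable toy set
  ```

  Only the consensus vector $z$ and the multipliers need to be exchanged between machines per iteration; the data blocks $B_j$ never leave their machine, which is the point of the decomposition.
  
  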
  • 20. References.
    Boyd, S., Parikh, N., Chu, E., Peleato, B. and Eckstein, J., 2011. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1), pp. 1-122.
    Zhang, C., Lee, H. and Shin, K.G., 2012. Efficient distributed linear classification algorithms via the alternating direction method of multipliers. In AISTATS, pp. 1398-1406.
  • 21. The End