COCOA: Communication-Efficient Coordinate Ascent

"COCOA: Communication-Efficient Coordinate Ascent" presentation from AMPCamp 5 by Virginia Smith

  1. COCOA: Communication-Efficient Coordinate Ascent. Virginia Smith, Martin Jaggi, Martin Takáč, Jonathan Terhorst, Sanjay Krishnan, Thomas Hofmann, & Michael I. Jordan
  2. LARGE-SCALE OPTIMIZATION
  3. LARGE-SCALE OPTIMIZATION, COCOA
  4. LARGE-SCALE OPTIMIZATION, COCOA, RESULTS
  5. LARGE-SCALE OPTIMIZATION, COCOA, RESULTS
  6. Machine Learning with Large Datasets
  7. Machine Learning with Large Datasets
  8. Machine Learning with Large Datasets: image/music/video tagging, document categorization, item recommendation, click-through rate prediction, sequence tagging, protein structure prediction, sensor data prediction, spam classification, fraud detection
  9. Machine Learning Workflow
  10. Machine Learning Workflow. DATA & PROBLEM: classification, regression, collaborative filtering, …
  11. Machine Learning Workflow. DATA & PROBLEM: classification, regression, collaborative filtering, … MACHINE LEARNING MODEL: logistic regression, lasso, support vector machines, …
  12. Machine Learning Workflow. DATA & PROBLEM: classification, regression, collaborative filtering, … MACHINE LEARNING MODEL: logistic regression, lasso, support vector machines, … OPTIMIZATION ALGORITHM: gradient descent, coordinate descent, Newton’s method, …
  13. Example: SVM Classification
  14. Example: SVM Classification
  15. Example: SVM Classification
  16. Example: SVM Classification
  17. Example: SVM Classification
  18. Example: SVM Classification
  19. Example: SVM Classification. (Figure: separating hyperplane $w \cdot x - b = 0$ with margin boundaries $w \cdot x - b = 1$ and $w \cdot x - b = -1$, normal vector $w$, and margin width $2 / \|w\|$.)
  20. Example: SVM Classification. (Figure as on slide 19.) Objective: $\min_{w \in \mathbb{R}^d} \; \frac{\lambda}{2} \|w\|^2 + \frac{1}{n} \sum_{i=1}^{n} \ell_{\text{hinge}}(y_i w^T x_i)$
  21. Example: SVM Classification. (Figure as on slide 19.) Objective: $\min_{w \in \mathbb{R}^d} \; \frac{\lambda}{2} \|w\|^2 + \frac{1}{n} \sum_{i=1}^{n} \ell_{\text{hinge}}(y_i w^T x_i)$. Solvers: descent algorithms and line search methods; acceleration, momentum, and conjugate gradients; Newton and quasi-Newton methods; coordinate descent; stochastic and incremental gradient methods; SMO; SVMlight; LIBLINEAR
  22. Machine Learning Workflow. DATA & PROBLEM: classification, regression, collaborative filtering, … MACHINE LEARNING MODEL: logistic regression, lasso, support vector machines, … OPTIMIZATION ALGORITHM: gradient descent, coordinate descent, Newton’s method, …
  23. Machine Learning Workflow. DATA & PROBLEM: classification, regression, collaborative filtering, … MACHINE LEARNING MODEL: logistic regression, lasso, support vector machines, … OPTIMIZATION ALGORITHM: gradient descent, coordinate descent, Newton’s method, … SYSTEMS SETTING: multi-core, cluster, cloud, supercomputer, …
  24. Machine Learning Workflow. DATA & PROBLEM: classification, regression, collaborative filtering, … MACHINE LEARNING MODEL: logistic regression, lasso, support vector machines, … OPTIMIZATION ALGORITHM: gradient descent, coordinate descent, Newton’s method, … SYSTEMS SETTING: multi-core, cluster, cloud, supercomputer, … Open Problem: efficiently solving objective when data is distributed
  25. Distributed Optimization
  26. Distributed Optimization
  27. Distributed Optimization. reduce: w = w
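
As a rough illustration of the SVM objective on slide 20, the sketch below evaluates the regularized average hinge loss for a fixed weight vector. It is not taken from the talk: the function names (`dot`, `hinge_loss`, `svm_objective`) and the toy data are hypothetical, the regularization weight is called `lam` on the assumption that the slide's symbol is λ, and the hinge loss is assumed to be the standard max(0, 1 - margin). Like the slide's formula, it omits the bias term b shown in the figure.

```python
def dot(u, v):
    """Inner product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(u, v))


def hinge_loss(margin):
    """Standard hinge loss: zero once the margin y * w^T x reaches 1."""
    return max(0.0, 1.0 - margin)


def svm_objective(w, xs, ys, lam):
    """(lam / 2) * ||w||^2 + (1 / n) * sum_i hinge(y_i * w^T x_i).

    w   -- weight vector (list of floats)
    xs  -- list of feature vectors, one per example
    ys  -- list of labels in {-1, +1}
    lam -- regularization weight (assumed to be the slide's lambda)
    """
    n = len(xs)
    regularizer = 0.5 * lam * dot(w, w)
    average_loss = sum(hinge_loss(y * dot(w, x)) for x, y in zip(xs, ys)) / n
    return regularizer + average_loss


# Toy data (made up for illustration): the first point is classified with
# margin 2 (no loss), the second with margin 0.5 (loss 0.5).
w = [1.0, 0.0]
xs = [[2.0, 0.0], [-0.5, 0.0]]
ys = [1, -1]
value = svm_objective(w, xs, ys, lam=0.1)
print(value)
```

Any of the solver families listed on slide 21 would minimize this same function over `w`; they differ only in how they choose the next iterate.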
