
© 2014 IBM Corporation


- 1. From ML Algorithms To Learning Machines (+ Optimization). Jean-François Puget, 11/11/2016, @JFPuget. © 2016 IBM Corporation. IBM Confidential
- 2. The Machine Learning Workflow. 25 years ago, academic topic: Data → ML algorithm → ? → publication
- 3. The Machine Learning Workflow. Perception now: Data → ??? → ML Algorithm → ??? → $$$
- 4. The Machine Learning Workflow. Simple! Data → Data Scientist → ML Algorithm → Model → $$$. R, Sklearn, Spark ML, Deep Learning, GBM (xgboost), vw, H2O, …
- 5. The Machine Learning Workflow. Focus on missing pieces: Data → ??? → ML Algorithm → ??? → $$$
- 6. The Machine Learning Workflow. Not that simple: Data → Data Prep → ML Algo → Model → Deploy → Predict → $$$
- 7. The gap between data scientists and operations is incredible
- 8. Diagram labels. Dev (Training): Labeled examples, Data prep, Algorithm, Model. Deploy. Ops (Scoring): New data, Data prep, Model, Predicted data. For each ML toolkit we need model serialization + a scalable scoring engine. We are building that for Spark ML.
- 9. The Machine Learning Workflow. Not that simple: Data → Data Prep → ML Algo → Model → Deploy → Predict → $$$
- 10. Cognitive Assistant for Data Scientists (IBM Research)
  - Objective:
    - Bring automation into key areas of large-scale data analysis tasks
    - Overcome “analytic decision overload” for Data Scientists
  - Current CADS System
    - Automated selection, composition, configuration, training, and deployment of modeling pipelines for supervised data mining tasks that leverages:
      - AI/Learning and Planning based principled exploration of analytic choices
      - Cross-platform analytic deployments (e.g., R, Spark, Python, SPSS) on Big Data platforms, Cloud
  - What is next…
    - Automation of more parts of the Data Scientist's workflow (e.g. automated feature engineering)
    - Extend for other problems, data types, scale and user requirements (e.g., unstructured data, Deep Learning)
    - Self-Learning and Adaptation
    - Build first-ever conversational data science system with CADS + Watson QA
- 11. SystemML (IBM Research). Stack labels: DML Scripts (Linear Regression Conjugate Gradient); Language: DML (Declarative Machine Learning Language); Compiler; Runtime: In-Memory Single Node (scale-up), Hadoop or Spark Cluster (scale-out). Timeline labels: since 2010, since 2012, since 2015.
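
Slide 11 names "Linear Regression Conjugate Gradient" as a SystemML script. As a rough illustration of that technique only, here is a minimal plain-Python sketch that solves the least-squares normal equations with conjugate gradient. It is not the DML script; the function name `linreg_cg`, the `reg` parameter and the toy data are assumptions made for this example.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def linreg_cg(X, y, reg=0.0, tol=1e-20, max_iter=100):
    """Solve (X^T X + reg * I) w = X^T y with conjugate gradient.

    X is a list of rows, y a list of targets. X^T X is never formed:
    each step only needs products with X and X^T.
    """
    cols = list(zip(*X))  # columns of X, i.e. rows of X^T

    def apply_A(v):
        Xv = [dot(row, v) for row in X]
        return [dot(col, Xv) + reg * vj for col, vj in zip(cols, v)]

    w = [0.0] * len(cols)
    r = [dot(col, y) for col in cols]  # residual b - A w for w = 0
    p = r[:]
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:  # squared residual norm small enough
            break
        Ap = apply_A(p)
        alpha = rs_old / dot(p, Ap)
        w = [wj + alpha * pj for wj, pj in zip(w, p)]
        r = [rj - alpha * aj for rj, aj in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [rj + (rs_new / rs_old) * pj for rj, pj in zip(r, p)]
        rs_old = rs_new
    return w


# Toy data (hypothetical): y = 1 + 2 * x, with an intercept column of ones.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
w_hat = linreg_cg(X, y)  # approximately [1.0, 2.0]
```

Conjugate gradient only needs matrix-vector products, which is one reason it suits the scale-out runtimes listed on the slide.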
- 12. The Machine Learning Workflow. Pain points: Data → Data Prep → ML Algo → Model → Deploy → Predict → $$$
- 13. The Machine Learning Workflow. Feedback loop: Data → Data Prep → ML Algo → Model → Deploy → Predict → $$$. Prediction accuracy monitoring: collect predictions vs actuals
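
One possible shape for the "collect predictions vs actuals" step on slide 13 is a rolling accuracy tracker. The sketch below is an assumption about how such monitoring could look, not the system described in the talk; the class name `AccuracyMonitor`, the window size and the alert threshold are all hypothetical.

```python
from collections import deque


class AccuracyMonitor:
    """Rolling accuracy over the most recent predictions whose actuals are known."""

    def __init__(self, window=100, alert_below=0.8):
        self.hits = deque(maxlen=window)  # True where prediction == actual
        self.alert_below = alert_below    # hypothetical alert threshold

    def record(self, predicted, actual):
        self.hits.append(predicted == actual)

    def accuracy(self):
        # None until at least one (prediction, actual) pair has been recorded
        return sum(self.hits) / len(self.hits) if self.hits else None

    def is_degraded(self):
        acc = self.accuracy()
        return acc is not None and acc < self.alert_below


monitor = AccuracyMonitor(window=4, alert_below=0.75)
for predicted, actual in [(1, 1), (0, 0), (1, 0), (0, 1), (1, 0)]:
    monitor.record(predicted, actual)
```

With a window of 4, only the last four pairs count, so one early correct prediction no longer masks the recent misses. A degraded flag like this is what would feed the loop back to training.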
- 14. What about Watson and cognitive computing? Cognitive = Natural language processing + Machine Learning + …
- 15. Machine Learning and Mathematical Optimization
  - Most ML algorithms solve an optimization problem: find parameters for a given model family that minimize the sum of
    - A loss function (prediction error)
    - A regularization term that favors model simplicity
  - Optimization algorithms: local methods
    - Stochastic gradient descent, conjugate gradient, LBFGS, …
    - Scale to a large number of examples
    - Embarrassingly parallel
    - Can be stuck in local minima
    - Hard time coping with additional constraints on the optimization problem
  - Mathematical optimization (e.g. CPLEX)
    - Can find the global optimum
    - Can deal with constraints, e.g. L0 norm
    - Limited in scale
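
The optimization problem on slide 15 can be written in a generic form. The notation below is an assumption (the slides give no symbols): \(\theta\) are the model parameters, \(f_\theta\) is a member of the model family, \(\ell\) is the loss, \(R\) is the regularizer, \(\lambda \ge 0\) weighs simplicity against fit, and \(C\) is an optional constraint set.

```latex
\min_{\theta \in C} \;\; \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(y_i, f_\theta(x_i)\bigr) \;+\; \lambda \, R(\theta)
```

For example, LASSO corresponds to a linear \(f_\theta(x) = \theta^\top x\), squared-error loss and \(R(\theta) = \lVert \theta \rVert_1\). Local methods usually take \(C\) to be the whole space, while a mathematical optimization solver can handle a non-trivial \(C\).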
- 16. Classical ML algorithms implemented with mathematical optimization models
  - Linear models: LASSO, Ridge Classifier, Elastic Net, Hinge loss, Hinge-squared loss
  - Support Vector Machines: Primal, Dual linear, Dual RBF, Hinge models
  - Decision Forests: Decision trees vote (preliminary work)
  - Multi-label problems: Using 1-vs-rest method
  - Alternating Least Squares: Application to Collaborative Filtering (recommendations)
  - Illustration: LASSO
- 17. Compressive Sensing. Image reconstruction with and without bounds on the pixel value. Panels: Original, Lasso (sklearn), Constrained Lasso (CPLEX), Distribution of pixel values
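
To make "Lasso with and without bounds" concrete, here is a small sketch of a box-constrained Lasso. The slide solves the constrained problem with CPLEX. This sketch swaps in plain coordinate descent, which is a different method, chosen only because it fits in a few lines. The function names, the objective scaling (0.5 times the squared error plus `lam` times the L1 norm) and the toy data are assumptions.

```python
def soft_threshold(z, t):
    """Proximal operator of t * |w|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def box_lasso(X, y, lam, lower=float("-inf"), upper=float("inf"), n_sweeps=200):
    """Minimize 0.5 * ||X w - y||^2 + lam * ||w||_1 subject to lower <= w_j <= upper.

    Each coordinate update is a 1-D convex problem, so its constrained
    minimizer is the soft-thresholded value clipped to [lower, upper].
    """
    cols = list(zip(*X))
    col_sq = [sum(c * c for c in col) for col in cols]
    w = [min(max(0.0, lower), upper)] * len(cols)  # feasible starting point
    residual = [yi - sum(xij * wj for xij, wj in zip(row, w))
                for row, yi in zip(X, y)]
    for _ in range(n_sweeps):
        for j, col in enumerate(cols):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot reduce the error
            # Correlation of column j with the residual that excludes w_j.
            rho = sum(c * r for c, r in zip(col, residual)) + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            new_wj = min(max(new_wj, lower), upper)
            delta = new_wj - w[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                w[j] = new_wj
    return w


# Toy "pixels" (hypothetical): identity design, so each coordinate is independent.
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, 0.2, -2.0]
w_free = box_lasso(X, y, lam=0.5)                        # close to [2.5, 0.0, -1.5]
w_box = box_lasso(X, y, lam=0.5, lower=0.0, upper=1.0)   # close to [1.0, 0.0, 0.0]
```

The unconstrained solution leaves the valid pixel range, while the bounded one stays in [0, 1], which mirrors the contrast the slide draws between the two reconstructions.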
- 18. Matrix factorization. Used in recommendation systems. User profiles x movie profiles = observed interactions
- 19. Alternating Least Squares with additional constraints (Hugues Juille)
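
Slides 18 and 19 describe factoring observed interactions into user and item profiles with Alternating Least Squares, plus extra constraints. The slides do not say which constraints were used, so the sketch below assumes non-negativity purely as an example. It also restricts itself to rank-1 profiles so that each update is a scalar formula; `als_rank1` and the toy ratings are hypothetical.

```python
def als_rank1(ratings, n_users, n_items, reg=1e-3, n_iters=50, nonneg=True):
    """Rank-1 ALS on observed entries only.

    ratings maps (user, item) -> value. Each user gets one number u[i] and
    each item one number v[j], so that u[i] * v[j] approximates the rating.
    """
    by_user = {i: [] for i in range(n_users)}
    by_item = {j: [] for j in range(n_items)}
    for (i, j), r in ratings.items():
        by_user[i].append((j, r))
        by_item[j].append((i, r))

    def solve(observed, other):
        # Closed-form ridge solution of the 1-D least-squares subproblem.
        num = sum(r * other[k] for k, r in observed)
        den = sum(other[k] ** 2 for k, _ in observed) + reg
        x = num / den
        # In one dimension, clipping at zero is the exact constrained minimizer.
        return max(0.0, x) if nonneg else x

    u = [1.0] * n_users
    v = [1.0] * n_items
    for _ in range(n_iters):
        u = [solve(by_user[i], v) for i in range(n_users)]  # fix items, update users
        v = [solve(by_item[j], u) for j in range(n_items)]  # fix users, update items
    return u, v


# Toy ratings (hypothetical) taken from the rank-1 matrix [1, 2, 3] x [1, 2];
# the entry for user 2, item 1 (true value 6) is held out.
observed = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0, (1, 1): 4.0, (2, 0): 3.0}
u, v = als_rank1(observed, n_users=3, n_items=2)
prediction = u[2] * v[1]  # estimate for the held-out entry
```

With higher-rank profiles each update becomes a small constrained least-squares problem rather than a clipped scalar. Per slide 15, that constrained setting is where a mathematical optimization solver is expected to help.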
- 20. References
  - IBM Watson Machine Learning: http://datascience.ibm.com/registration/stepone
  - System ML: https://systemml.apache.org/
  - CADS: ICML 2014
  - CPLEX-learn. Contributors: Jean-Francois Puget, Paul Shaw, Vincent Beraudier, Pierre Bonami, Daniel Junglas, Hugues Juille, Renaud Dumeur, Viu Long Kong, Philippe Couronne
