Why Machine Learning Algorithms Fall Short (And What You Can Do About It): Many think that machine learning is all about the algorithms. Want a self-learning system? Get your data, start coding, or hire a PhD who will build you a model that will stand the test of time. Of course we know that this is not enough. Models degrade over time, algorithms that work great on yesterday's data may not be the best option today, and new data sources and types become available. In short, your self-learning system may not be learning anything at all. In this session, we will examine how to overcome challenges in creating self-learning systems that perform better and are built to stand the test of time. We will show how to apply mathematical optimization algorithms that often prove superior to the local optimization methods favored by typical machine learning applications, and discuss why these methods can create better results. We will also examine the role of smart automation in the context of machine learning, and how smart automation can create self-learning systems that are built to last.


© 2014 IBM Corporation

- 1. © 2016 IBM Corporation. IBM Confidential. From ML Algorithms To Learning Machines (+ Optimization). Jean-François Puget, 11/11/2016, @JFPuget
- 2. The Machine Learning Workflow, 25 years ago: an academic topic. Data → ML algorithm → publication
- 3. The Machine Learning Workflow, the perception now: Data → ??? → ML Algorithm → ??? → $$$
- 4. The Machine Learning Workflow, simple! Data → Data Scientist → ML Algorithm → Model → $$$ (R, sklearn, Spark ML, Deep Learning, GBM (xgboost), vw, H2O, …)
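The "simple" workflow on this slide can be sketched in a few lines with scikit-learn (one of the toolkits the slide names). The dataset and the choice of logistic regression are illustrative, not taken from the deck:

```python
# Minimal sketch of the "simple" workflow: data -> data scientist -> ML algorithm -> model.
# Dataset and model choice are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)  # the "ML algorithm" step
accuracy = model.score(X_test, y_test)              # held-out accuracy; the "$$$" only comes if this holds up in production
```

As the following slides argue, this snippet is the easy part: data prep, deployment, and scoring are missing.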
- 5. The Machine Learning Workflow, focus on the missing pieces: Data → ??? → ML Algorithm → ??? → $$$
- 6. The Machine Learning Workflow, not that simple: Data → Data Prep → ML Algo → Model → Deploy → Predict → $$$
- 7. The gap between data scientists and operations is incredible.
- 8. Training: labeled examples → data prep → algorithm → model. Scoring: new data → data prep → deployed model → predicted data. Deployment is the hand-off between dev and ops. For each ML toolkit we need model serialization plus a scalable scoring engine; we are building that for Spark ML.
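The dev/ops hand-off this slide describes can be sketched as follows: the training side serializes the model, and the scoring side deserializes it and scores new data. Here `pickle` and a least-squares linear model stand in for a toolkit-specific persistence format (e.g. Spark ML's native model persistence mentioned on the slide); both are illustrative assumptions:

```python
# Sketch of the training/scoring split: serialize on the dev side, score on the ops side.
import pickle
import numpy as np

# --- training side (dev) ---
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=100)
w, *_ = np.linalg.lstsq(X, y, rcond=None)  # fit a linear model
blob = pickle.dumps(w)                     # model serialization

# --- scoring side (ops) ---
w_deployed = pickle.loads(blob)            # load the deployed model
new_data = rng.normal(size=(5, 3))
predictions = new_data @ w_deployed        # scoring is a plain matrix product here
```

In a real system the serialized artifact crosses a process or cluster boundary, and the scoring engine must apply the same data prep as training, which is exactly the gap the slide highlights.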
- 9. The Machine Learning Workflow, not that simple: Data → Data Prep → ML Algo → Model → Deploy → Predict → $$$
- 10. Cognitive Assistant for Data Scientists (CADS), IBM Research. Objective: bring automation into key areas of large-scale data analysis tasks and overcome "analytic decision overload" for data scientists. Current CADS system: automated selection, composition, configuration, training, and deployment of modeling pipelines for supervised data mining tasks, leveraging AI/learning- and planning-based principled exploration of analytic choices, and cross-platform analytic deployments (e.g., R, Spark, Python, SPSS) on Big Data and cloud platforms. What is next: automation of more parts of the data scientist's workflow (e.g., automated feature engineering); extension to other problems, data types, scales, and user requirements (e.g., unstructured data, Deep Learning); self-learning and adaptation; and a first-ever conversational data science system built with CADS + Watson QA.
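A toy sketch of the automation idea behind CADS: enumerate candidate modeling pipelines, evaluate each on held-out data, and keep the best. CADS itself uses AI planning over a far richer space of analytic choices; the two candidate models and the scoring rule below are purely illustrative:

```python
# Toy automated model selection: try each candidate, keep the best validation score.
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=100)
X_tr, y_tr, X_va, y_va = X[:70], y[:70], X[70:], y[70:]

def mean_model(X_tr, y_tr):                # candidate 1: always predict the training mean
    m = y_tr.mean()
    return lambda X: np.full(len(X), m)

def linear_model(X_tr, y_tr):              # candidate 2: least-squares linear fit
    w, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    return lambda X: X @ w

candidates = {"mean": mean_model, "linear": linear_model}
scores = {name: float(np.mean((fit(X_tr, y_tr)(X_va) - y_va) ** 2))
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)         # the automated selection step
```

Replacing the exhaustive loop with a principled search over compositions and configurations is what turns this toy into a system like CADS.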
- 11. SystemML, IBM Research. DML (Declarative Machine Learning Language) scripts (since 2010) feed a compiler and a runtime that target either in-memory single-node execution (scale-up, since 2012) or a Hadoop or Spark cluster (scale-out, since 2015). Example algorithms: linear regression, conjugate gradient.
- 12. The Machine Learning Workflow, pain points: Data → Data Prep → ML Algo → Model → Deploy → Predict → $$$
- 13. The Machine Learning Workflow, the feedback loop: Data → Data Prep → ML Algo → Model → Deploy → Predict → $$$, with prediction accuracy monitoring: collect predictions vs. actuals.
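The feedback loop on this slide amounts to logging predictions alongside the actuals that arrive later and watching the live error for degradation. A minimal sketch, in which the error metric, baseline, and tolerance threshold are illustrative assumptions:

```python
# Sketch of prediction accuracy monitoring: compare live error against the
# error measured at deployment time; degradation should trigger retraining.
import numpy as np

def monitor(predictions, actuals, baseline_error, tolerance=0.10):
    """Return (live_error, degraded): degraded=True means live error exceeds
    the deployment-time baseline by more than the tolerance."""
    live_error = float(np.mean(np.abs(np.asarray(predictions) - np.asarray(actuals))))
    degraded = live_error > baseline_error * (1 + tolerance)
    return live_error, degraded

# Example: three scored records whose actuals have come back.
live_error, degraded = monitor([1.0, 2.0, 3.0], [1.1, 2.0, 2.9], baseline_error=0.2)
```

Without this loop the "self-learning" system never learns from its own mistakes, which is the abstract's central warning.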
- 14. What about Watson and cognitive computing? Cognitive = natural language processing + machine learning + …
- 15. Machine Learning and Mathematical Optimization. Most ML algorithms solve an optimization problem: find parameters for a given model family that minimize a loss function (prediction error) plus a model simplicity term (regularization). The usual optimization algorithms are local methods (stochastic gradient descent, conjugate gradient, L-BFGS, …): they scale to large numbers of examples and are embarrassingly parallel, but they can get stuck in local minima and have a hard time coping with additional constraints on the optimization problem. Mathematical optimization (e.g., CPLEX) can find the global optimum and can deal with constraints, e.g., an L0 norm, but is limited in scale.
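The "loss plus regularization, solved by a local method" pattern on this slide can be made concrete with ridge regression trained by plain gradient descent. The data, step size, and iteration count are illustrative; the point is that the training loop is nothing but an optimizer on the regularized loss:

```python
# Ridge regression objective: (1/2n)||Xw - y||^2 + (lam/2)||w||^2,
# minimized by a local method (batch gradient descent).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5])

lam, w = 0.1, np.zeros(3)
for _ in range(500):
    grad = X.T @ (X @ w - y) / len(y) + lam * w  # loss gradient + regularization gradient
    w -= 0.1 * grad                              # the local optimization step

objective = float(np.mean((X @ w - y) ** 2) / 2 + lam / 2 * w @ w)
```

This objective is convex, so the local method finds the global optimum here; the slide's caveat about local minima and constraints applies once the model family or the constraint set gets richer.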
- 16. Classical ML algorithms implemented with mathematical optimization models. Linear models: LASSO, ridge classifier, elastic net, hinge loss, squared hinge loss. Support vector machines: primal, dual linear, dual RBF, hinge models. Decision forests: decision trees vote (preliminary work). Multi-label problems: using the one-vs-rest method. Alternating least squares: applied to collaborative filtering (recommendations).
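Taking LASSO from this list as an example, the optimization model is: minimize (1/2n)||Xw − y||² + α||w||₁. A minimal sketch solving it by coordinate descent with soft-thresholding (one standard method for this model; the data and α are illustrative):

```python
# LASSO as an explicit optimization model, solved by coordinate descent.
import numpy as np

def lasso_cd(X, y, alpha, n_iter=200):
    """Minimize (1/2n)||Xw - y||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ w + X[:, j] * w[j]          # residual with feature j removed
            rho = X[:, j] @ r
            # soft-thresholding: the L1 term zeroes out small coordinates
            w[j] = np.sign(rho) * max(abs(rho) - alpha * n, 0.0) / col_sq[j]
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(80, 5))
y = X @ np.array([3.0, 0.0, 0.0, -2.0, 0.0])        # sparse ground truth
w = lasso_cd(X, y, alpha=0.5)
```

The L1 penalty drives the irrelevant coefficients to exactly zero, which is why LASSO is a feature-selection model as well as a regression model.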
- 17. Compressive sensing: image reconstruction with and without bounds on the pixel values. Panels: original, Lasso (sklearn), constrained Lasso (CPLEX), distribution of pixel values.
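The point of this slide is that adding box constraints (pixel values in [0, 1]) improves the reconstruction, and that a solver which handles constraints natively (CPLEX on the slide) can exploit them. A small sketch of the constrained idea using projected gradient on a least-squares recovery problem; the sensing matrix, sizes, and solver choice are illustrative assumptions, not the slide's actual setup:

```python
# Bounded reconstruction sketch: minimize ||Ax - b||^2 subject to 0 <= x <= 1,
# via projected gradient descent (clipping = projection onto the box).
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(60, 20))                     # "sensing" matrix
x_true = np.clip(rng.normal(0.5, 0.3, size=20), 0.0, 1.0)  # pixel values in [0, 1]
b = A @ x_true                                    # observed measurements

x = np.zeros(20)
step = 1.0 / np.linalg.norm(A.T @ A, 2)           # 1/L for the quadratic objective
for _ in range(2000):
    x = x - step * A.T @ (A @ x - b)              # gradient step on ||Ax - b||^2 / 2
    x = np.clip(x, 0.0, 1.0)                      # enforce the pixel-value bounds
```

An unconstrained solver can return pixel values outside [0, 1] that must be clipped after the fact; building the bounds into the optimization model, as here, gives the solver that information up front.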
- 18. Matrix factorization, used in recommendation systems: user profiles × movie profiles ≈ observed interactions.
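The factorization on this slide, and the alternating least squares method named on the next one, can be sketched as follows: alternately solve a least-squares problem for the user profiles U with the item profiles V fixed, then the reverse, so that U Vᵀ approximates the interaction matrix R. The rank, sizes, and iteration count are illustrative:

```python
# Alternating least squares (ALS) sketch: fit U and V so that U @ V.T ~ R.
import numpy as np

rng = np.random.default_rng(4)
k = 2                                                   # profile dimension (rank)
R = rng.normal(size=(8, k)) @ rng.normal(size=(k, 6))   # rank-2 "interaction" matrix

U, V = rng.normal(size=(8, k)), rng.normal(size=(6, k))
lam = 1e-6                                              # tiny ridge term for stability
for _ in range(50):
    # fix V, solve least squares for U; then fix U, solve for V
    U = R @ V @ np.linalg.inv(V.T @ V + lam * np.eye(k))
    V = R.T @ U @ np.linalg.inv(U.T @ U + lam * np.eye(k))

error = float(np.linalg.norm(R - U @ V.T))
```

Each half-step is a plain least-squares solve, which is what makes ALS easy to parallelize across users and items; the constrained variant on the next slide adds side constraints to these subproblems.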
- 19. Alternating least squares with additional constraints (Hugues Juille).
- 20. References. IBM Watson Machine Learning: http://datascience.ibm.com/registration/stepone. SystemML: https://systemml.apache.org/. CADS: ICML 2014. CPLEX-learn contributors: Jean-François Puget, Paul Shaw, Vincent Beraudier, Pierre Bonami, Daniel Junglas, Hugues Juille, Renaud Dumeur, Viu Long Kong, Philippe Couronne.
