Jean-François Puget, Distinguished Engineer, Machine Learning and Optimization, IBM at MLconf SF 2016

© 2016 IBM Corporation. IBM Confidential
From ML Algorithms To Learning Machines (+ Optimization)
Jean-François Puget
11/11/2016
@JFPuget

The Machine Learning Workflow
• 25 years ago, academic topic
Data → ML algorithm → ? → publication

The Machine Learning Workflow
• Perception now
Data → ??? → ML Algorithm → ??? → $$$

The Machine Learning Workflow
• Simple!
Data → Data Scientist → ML Algorithm → Model → $$$
Tools: R, Sklearn, Spark ML, Deep Learning, GBM (xgboost), vw, H2O, …

The Machine Learning Workflow
• Focus on missing pieces
Data → ??? → ML Algorithm → ??? → $$$

The Machine Learning Workflow
• Not that simple
Data → Data Prep → ML Algo → Model → Deploy → Predict → $$$

The gap between data scientists and operations is incredible

Dev (Training): Labeled examples → Data prep → Algorithm → Model
Deploy: the Model moves from Dev to Ops
Ops (Scoring): New data → Data prep → Scoring (with the deployed Model) → Predicted data
For each ML toolkit we need model serialization plus a scalable scoring engine.
We are building that for Spark ML.
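The Dev/Ops split described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Spark ML work mentioned on the slide: the model is a toy linear regressor, the serialized format is plain JSON, and the names `train_linear_model`, `serialize`, and `load_scorer` are hypothetical.

```python
import json


def train_linear_model(rows, targets, lr=0.05, epochs=2000):
    """Dev side: fit y ~ w.x + b by gradient descent (stand-in for any toolkit)."""
    n_features = len(rows[0])
    n = len(rows)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for j in range(n_features):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return {"kind": "linear", "weights": w, "bias": b}


def serialize(model):
    """Turn the trained model into a toolkit-independent payload."""
    return json.dumps(model)


def load_scorer(payload):
    """Ops side: rebuild a scoring function from the payload alone."""
    model = json.loads(payload)
    w, b = model["weights"], model["bias"]

    def score(batch):
        return [sum(wi * xi for wi, xi in zip(w, x)) + b for x in batch]

    return score


# Dev: train on labeled examples generated from y = 2x + 1, then serialize.
rows = [[0.0], [1.0], [2.0], [3.0]]
targets = [1.0, 3.0, 5.0, 7.0]
payload = serialize(train_linear_model(rows, targets))

# Ops: score new data without access to the training code path.
score = load_scorer(payload)
predictions = score([[4.0], [5.0]])
```

The point of the design is that the scoring side depends only on the payload format, so it can be scaled and operated independently of the training toolkit.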

The Machine Learning Workflow
• Not that simple
Data → Data Prep

Cognitive Assistant for Data Scientists
• Objective:
  • Bring automation into key areas of large-scale data analysis tasks
  • Overcome “analytic decision overload” for Data Scientists
• Current CADS System
  • Automated selection, composition, configuration, training, and deployment of modeling pipelines for supervised data mining tasks that leverages:
    • AI/Learning and Planning based principled exploration of analytic choices
    • Cross-platform analytic deployments (e.g., R, Spark, Python, SPSS) on Big Data platforms and Cloud
• What is next…
  • Automation of more parts of the Data Scientist's workflow (e.g. automated feature engineering)
  • Extend for other problems, data types, scale and user requirements (e.g., unstructured data, Deep Learning)
  • Self-Learning and Adaptation
  • Build first-ever conversational data science system with CADS + Watson QA
IBM Research

SystemML
IBM Research
• Language: DML (Declarative Machine Learning Language)
• DML Scripts (example shown: Linear Regression Conjugate Gradient)
• Compiler
• Runtime
  • In-Memory Single Node (scale-up)
  • Hadoop or Spark Cluster (scale-out)
• Timeline labels on the diagram: since 2010, since 2012, since 2015
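The slide names a Linear Regression Conjugate Gradient script as its DML example. The sketch below shows the same idea in plain Python rather than DML, as an illustration only: it solves the regularized normal equations (XᵀX + reg·I) w = Xᵀy with conjugate gradient. The function name `linreg_cg`, the `reg` and `tol` parameters, and the sample data are assumptions made for this example.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linreg_cg(X, y, reg=1e-6, tol=1e-10, max_iter=None):
    """Solve (X^T X + reg * I) w = X^T y with conjugate gradient."""
    n_features = len(X[0])
    Xt = [list(col) for col in zip(*X)]

    def apply_normal(w):
        # Apply (X^T X + reg * I) to w without forming X^T X explicitly.
        Xw = matvec(X, w)
        return [g + reg * wi for g, wi in zip(matvec(Xt, Xw), w)]

    w = [0.0] * n_features
    r = matvec(Xt, y)          # residual of the linear system at w = 0
    p = list(r)                # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter or 2 * n_features):
        if rs_old < tol:       # squared residual norm small enough
            break
        Ap = apply_normal(p)
        alpha = rs_old / dot(p, Ap)
        w = [wi + alpha * pi for wi, pi in zip(w, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return w


# First column is the intercept; the data follow y = 1 + 2x.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
w = linreg_cg(X, y)
```

Because each iteration only needs matrix-vector products with X and its transpose, this style of solver maps naturally onto both single-node and distributed runtimes.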

The Machine Learning Workflow
• Pain points
Data → Data Prep

The Machine Learning Workflow
• Feedback loop
Data → Data Prep
Prediction accuracy monitoring:
Collect predictions vs actuals
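Collecting predictions against actuals can be sketched as a small rolling monitor. This is a hypothetical illustration, not a description of any IBM product: the class name `AccuracyMonitor`, the window size, and the retraining threshold are all assumptions.

```python
from collections import deque


class AccuracyMonitor:
    """Track rolling accuracy of deployed predictions against later actuals."""

    def __init__(self, window=100, threshold=0.8):
        self.pending = {}                      # prediction id -> predicted label
        self.outcomes = deque(maxlen=window)   # True where prediction matched actual
        self.threshold = threshold

    def record_prediction(self, pred_id, predicted):
        self.pending[pred_id] = predicted

    def record_actual(self, pred_id, actual):
        # Ignore actuals that have no matching recorded prediction.
        if pred_id not in self.pending:
            return
        predicted = self.pending.pop(pred_id)
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        acc = self.accuracy()
        return acc is not None and acc < self.threshold


monitor = AccuracyMonitor(window=4, threshold=0.75)
for pred_id, predicted in enumerate(["spam", "ham", "spam", "ham"]):
    monitor.record_prediction(pred_id, predicted)
for pred_id, actual in enumerate(["spam", "spam", "ham", "ham"]):
    monitor.record_actual(pred_id, actual)

current_accuracy = monitor.accuracy()
retrain = monitor.needs_retraining()
```

A signal like `needs_retraining` is what closes the loop: it feeds back from scoring in operations to a new round of data preparation and training.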

What about Watson and cognitive computing?
Cognitive = Natural language processing + Machine Learning + …

Machine Learning and Mathematical Optimization
• Most ML algorithms solve an optimization problem: find parameters for a given model family that minimize
  • Loss function (prediction error)
  • Model complexity (regularization)
• Optimization algorithms: local methods
  • Stochastic gradient descent, conjugate gradient, LBFGS, …
  • Scale to a large number of examples
  • Embarrassingly parallel
  • Can get stuck in local minima
  • Have a hard time coping with additional constraints on the optimization problem
• Mathematical optimization (e.g. CPLEX)
  • Can find the global optimum
  • Can deal with constraints, e.g. L0 norm
  • Limited in scale
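The optimization problem described on this slide can be written out as follows. The notation is assumed for illustration and does not come from the talk: ℓ is the loss, R the regularizer, λ its weight, and f_w the model with parameters w. The second problem shows one common way a mixed-integer solver can bound the L0 norm by k, using binary indicators z_j and a big-M constant; whether CPLEX-learn uses this exact formulation is not stated in the slides.

```latex
% Regularized training objective
\min_{w} \; \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(y_i, f_w(x_i)\bigr) + \lambda \, R(w)

% L0-constrained variant: at most k nonzero parameters (big-M formulation)
\min_{w, z} \; \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(y_i, f_w(x_i)\bigr)
\quad \text{s.t.} \quad
-M z_j \le w_j \le M z_j, \qquad
\sum_{j=1}^{d} z_j \le k, \qquad
z_j \in \{0, 1\}
```

The binary variables are what make the second problem hard for gradient-based local methods and natural for a mathematical optimization solver.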

Classical ML Algorithms implemented with mathematical optimization models
• Linear models: LASSO, Ridge Classifier, Elastic Net, Hinge loss, Hinge-squared loss
• Support Vector Machines: Primal, Dual linear, Dual RBF, Hinge models
• Decision Forests: Decision trees vote (preliminary work)
• Multi-label problems: Using 1-vs-rest method
• Alternating Least Squares: Application to Collaborative Filtering (recommendations)
LASSO
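As an illustration of what a mathematical optimization model for LASSO can look like, the L1 penalty can be linearized with auxiliary variables t_j, which turns the problem into a convex quadratic program. This is a standard textbook reformulation, offered here as a sketch; the slides do not show the model actually used. The 1/(2n) scaling and the symbol α follow the objective documented for scikit-learn's Lasso.

```latex
% LASSO in its usual form
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1

% Equivalent quadratic program with auxiliary variables t_j \ge |w_j|
\min_{w, t} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \sum_{j=1}^{d} t_j
\quad \text{s.t.} \quad -t_j \le w_j \le t_j, \qquad j = 1, \dots, d
```

Once the model is in this form, extra requirements such as bounds on individual coefficients are just additional linear constraints.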

Compressive Sensing
• Image reconstruction with and without bounds on the pixel value
[Figure panels: Original; Lasso (sklearn); Constrained Lasso (CPLEX). Plot: distribution of pixel values]
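The effect of bounding the pixel values can be illustrated with a small Lasso solver that enforces box constraints. Note the swapped-in technique: this sketch uses proximal gradient descent (ISTA) with clipping, not CPLEX, and it is not the code behind the figure. In one dimension, clipping the soft-thresholded value is the exact proximal step for an L1 penalty plus an interval constraint. The function names, the tiny measurement matrix, and the [0, 1] bounds are assumptions for this example.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |value|."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def objective(A, y, w, alpha):
    """0.5 * ||A w - y||^2 + alpha * ||w||_1"""
    residual = [sum(a * wj for a, wj in zip(row, w)) - yi for row, yi in zip(A, y)]
    return 0.5 * sum(r * r for r in residual) + alpha * sum(abs(wj) for wj in w)


def bounded_lasso(A, y, alpha, lower, upper, step, iters=500):
    """Minimize the Lasso objective subject to lower <= w_j <= upper."""
    n_rows, n_cols = len(A), len(A[0])
    w = [min(max(0.0, lower), upper)] * n_cols   # feasible starting point
    for _ in range(iters):
        residual = [sum(a * wj for a, wj in zip(row, w)) - yi
                    for row, yi in zip(A, y)]
        grad = [sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step, then soft-threshold, then clip to the pixel bounds.
        w = [min(max(soft_threshold(wj - step * gj, step * alpha), lower), upper)
             for wj, gj in zip(w, grad)]
    return w


# Two measurements of three unknown "pixels" (fewer measurements than unknowns).
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.2, 0.2]
# step = 1 / L, where L = 3 is the largest eigenvalue of A^T A for this matrix.
w = bounded_lasso(A, y, alpha=0.1, lower=0.0, upper=1.0, step=1.0 / 3.0)
```

Every iterate stays inside the bounds by construction, which is the property the unconstrained Lasso reconstruction lacks.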

Matrix factorization
Used in recommendation systems
User profiles x movie profiles = observed interactions

Alternating Least Squares with additional constraints
(Hugues Juille)
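A minimal sketch of alternating least squares on observed interactions is shown below, restricted to rank 1 so that each update has a closed form. The slides do not say which additional constraints were used; nonnegativity of the factors is a hypothetical choice for illustration, as are the function name `als_rank1` and the sample ratings.

```python
def als_rank1(ratings, n_users, n_items, reg=1e-3, iters=50, nonneg=True):
    """Fit ratings[(i, j)] ~ u[i] * v[j] on observed entries only.

    ratings: dict mapping (user, item) to an observed value.
    nonneg:  hypothetical extra constraint keeping all factors >= 0.
    """
    u = [1.0] * n_users
    v = [1.0] * n_items
    for _ in range(iters):
        # Fix item factors, solve a 1-D regularized least squares per user.
        for i in range(n_users):
            num = sum(r * v[j] for (a, j), r in ratings.items() if a == i)
            den = sum(v[j] ** 2 for (a, j), r in ratings.items() if a == i) + reg
            u[i] = num / den
            if nonneg:
                u[i] = max(u[i], 0.0)   # exact for a 1-D problem with u >= 0
        # Fix user factors, solve a 1-D regularized least squares per item.
        for j in range(n_items):
            num = sum(r * u[i] for (i, b), r in ratings.items() if b == j)
            den = sum(u[i] ** 2 for (i, b), r in ratings.items() if b == j) + reg
            v[j] = num / den
            if nonneg:
                v[j] = max(v[j], 0.0)
    return u, v


# Observed entries of the rank-1 matrix [1, 2] x [1, 2, 3]; entry (1, 2) is hidden.
ratings = {(0, 0): 1.0, (0, 1): 2.0, (0, 2): 3.0,
           (1, 0): 2.0, (1, 1): 4.0}
u, v = als_rank1(ratings, n_users=2, n_items=3)
predicted_missing = u[1] * v[2]   # estimate for the hidden interaction
```

With higher rank, each 1-D update becomes a small constrained least squares problem per user or item, which is where a mathematical optimization solver can handle richer constraints.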

References
• IBM Watson Machine Learning: http://datascience.ibm.com/registration/stepone
• SystemML: https://systemml.apache.org/
• CADS: ICML 2014
• CPLEX-learn Contributors: Jean-Francois Puget, Paul Shaw, Vincent Beraudier, Pierre Bonami, Daniel Junglas, Hugues Juille, Renaud Dumeur, Viu Long Kong, Philippe Couronne
