Machine Learning 101
Fred Verheul
Machine Learning
“Field of study that gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959)
What is Machine Learning?
Traditional Programming: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program
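As a toy illustration of the contrast on this slide, the sketch below first hard-codes a rule and then derives the same kind of rule from labeled examples. The temperature data, the function names and the midpoint heuristic are all assumptions made up for illustration; they are not from the talk.

```python
# Traditional programming: a human writes the rule (the "program").
def is_hot_rule(temp_c: float) -> bool:
    return temp_c >= 25.0  # threshold chosen by the programmer


# Machine learning: data plus desired outputs go in, a "program" comes out.
def learn_threshold(temps: list[float], labels: list[bool]) -> float:
    """Midpoint between the warmest 'not hot' and the coolest 'hot' example."""
    coolest_hot = min(t for t, y in zip(temps, labels) if y)
    warmest_not_hot = max(t for t, y in zip(temps, labels) if not y)
    return (coolest_hot + warmest_not_hot) / 2


# Hypothetical labeled examples (data + expected output).
temps = [12.0, 18.0, 21.0, 27.0, 30.0, 33.0]
labels = [False, False, False, True, True, True]

threshold = learn_threshold(temps, labels)  # the learned "program" is one number


def is_hot_learned(temp_c: float) -> bool:
    return temp_c >= threshold
```

With other labeled examples the same code would produce a different threshold, without anyone editing the rule by hand.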
Prediction is hard…
Sweet spot for Machine Learning
• It’s impossible to write down the rules in code:
  • Too many rules
  • Too many factors influencing the rules
  • Too finely tuned
  • We just don’t know the rules (e.g. image recognition)
• Lots of labeled data (examples) is available (e.g. historical data)
Basic Machine Learning ‘workflow’
Training Phase: Training data → Feature vectors + Labels → Machine Learning Algorithm → Predictive Model
Operational Phase: New data → Feature vectors → Predictive Model → Prediction
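The two phases can be sketched in a few lines of Python. This is only a minimal illustration, assuming a nearest-neighbour model as the learning algorithm; the fruit measurements, the class name and the `train` helper are hypothetical and not from the talk.

```python
import math


class NearestNeighbourModel:
    """Predictive model that labels a point like its closest training example."""

    def __init__(self, vectors, labels):
        self.vectors = vectors
        self.labels = labels

    def predict(self, vector):
        distances = [math.dist(vector, v) for v in self.vectors]
        return self.labels[distances.index(min(distances))]


def train(vectors, labels):
    """Training phase: feature vectors + labels in, predictive model out."""
    return NearestNeighbourModel(list(vectors), list(labels))


# Training phase (hypothetical data: [weight in g, diameter in cm]).
training_vectors = [[150.0, 7.0], [170.0, 7.5], [100.0, 5.5], [110.0, 5.8]]
training_labels = ["apple", "apple", "lemon", "lemon"]
model = train(training_vectors, training_labels)

# Operational phase: new data -> feature vector -> prediction.
prediction = model.predict([160.0, 7.2])
```

The model object is the only thing carried over from the training phase to the operational phase; the learning algorithm itself is no longer needed once the model exists.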
Training Phase in more detail
Raw data → Data preparation → Feature vectors → Training data / Test data → Model building (by ML algorithm), aka ‘learning’ → Model evaluation → Predictive model
• Data preparation: data cleansing, data transformation, normalization, feature extraction
• Feedback loop: evaluation results feed back into the earlier steps
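Two of the steps named on this slide, normalization and the split into training and test data, can be sketched as follows. The helper names, the min-max scaling choice and the sample values are assumptions for illustration only.

```python
def min_max_normalize(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column: nothing to scale
    return [(x - lo) / (hi - lo) for x in column]


def train_test_split(rows, test_fraction=0.25):
    """Hold out the last part of the rows as test data (no shuffling here)."""
    n_test = max(1, round(len(rows) * test_fraction))
    return rows[:-n_test], rows[-n_test:]


raw_ages = [20, 30, 40, 60]                 # hypothetical raw feature values
normalized = min_max_normalize(raw_ages)
train_rows, test_rows = train_test_split(list(range(8)))
```

In practice the rows are usually shuffled before splitting, and the scaling parameters are computed on the training data only so that no information from the test data leaks into the model.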
Examples of ML tasks
Supervised learning
• Regression: target is numeric
• Classification: target is categorical
Unsupervised learning
• Clustering
• Dimensionality reduction
Modeling: so many algorithms…
ML Algorithms: by Representation
Representation: collection of candidate models/programs, aka hypothesis space
• Decision trees
• Instance-based
• Neural networks
• Model ensembles
ML Algorithms: by Evaluation
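Evaluation: quality measure for a model
Regression
• Example metric: Root Mean Squared Error
• $\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$, where $y_i$ is the actual value and $\hat{y}_i$ the predicted value for example $i$
Binary classification: confusion matrix
Example: medical test for a disease

                  Predicted Positive           Predicted Negative
True class P      True positives (TP) = 8      False negatives (FN) = 2
True class N      False positives (FP) = 19    True negatives (TN) = 971

Accuracy: (8 + 971) / 1000 = 97.9%
Better evaluation metrics:
• Precision: TP / (TP + FP) = 8 / (8 + 19) ≈ 29.6%
• Recall: TP / (TP + FN) = 8 / (8 + 2) = 80%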
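The metrics on this slide are simple enough to compute by hand, as the sketch below shows. The function names are made up for illustration; the four counts (TP = 8, FP = 19, FN = 2, TN = 971) are the ones implied by the slide's accuracy, precision and recall figures, while the regression values passed to `rmse` are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error for a regression model."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from the four confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Counts from the medical-test example on the slide.
metrics = classification_metrics(tp=8, fp=19, fn=2, tn=971)

# Hypothetical actual vs. predicted values for a regression model.
example_rmse = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

The example makes the slide's point concrete: accuracy is about 98% because the negative class dominates, while fewer than a third of the positive predictions are actually correct.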
ML Algorithms: by Optimization
Optimization: how the algorithm ‘learns’; depends on representation and evaluation
• Greedy search (an example of combinatorial optimization)
• Gradient descent (or, in general, convex optimization)
• Linear programming (or, in general, constrained/nonlinear optimization)
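To make one of these optimizers concrete, here is a minimal sketch of gradient descent fitting a straight line by minimizing mean squared error. The data, the learning rate and the number of steps are assumptions chosen so that this small example converges; they are not recommendations from the talk.

```python
def fit_line_gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ≈ w * x + b by minimizing mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # hypothetical data generated from y = 2x + 1
w, b = fit_line_gradient_descent(xs, ys)
```

Here the representation is a straight line, the evaluation measure is mean squared error, and gradient descent is the optimization; swapping any one of the three gives a different learning algorithm.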
Training error vs test error
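The gap between the two kinds of error can be shown with a small, fully deterministic experiment. This is a sketch under stated assumptions: the one-dimensional data set, the two noisy labels and the k-nearest-neighbour classifier are all invented for illustration and are not the example shown in the talk.

```python
from collections import Counter


def knn_predict(train_x, train_y, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def error_rate(train_x, train_y, xs, ys, k):
    """Fraction of the points in (xs, ys) that the k-NN model gets wrong."""
    wrong = sum(knn_predict(train_x, train_y, x, k) != y for x, y in zip(xs, ys))
    return wrong / len(xs)


# Hypothetical 1-D data: the true class is 1 when x >= 10, else 0.
train_x = [float(x) for x in range(0, 20, 2)]
train_y = [1 if x >= 10 else 0 for x in train_x]
train_y[2] = 1  # label noise at x = 4
train_y[7] = 0  # label noise at x = 14

test_x = [x + 0.5 for x in train_x]
test_y = [1 if x >= 10 else 0 for x in test_x]

# For each k: (training error, test error).
results = {
    k: (error_rate(train_x, train_y, train_x, train_y, k),
        error_rate(train_x, train_y, test_x, test_y, k))
    for k in (1, 3)
}
```

With k = 1 the model memorizes the training data, noise included, so its training error is zero but it errs on unseen points near the noisy labels. With k = 3 it gets the two noisy training points "wrong" yet classifies every test point correctly, which is why models are judged on test error rather than training error.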
Data Science for Business
• Focuses more on general principles than specific algorithms
• Not math-heavy, but does contain some math
• O’Reilly link: http://shop.oreilly.com/product/0636920028918.do
• Book website: http://data-science-for-biz.com/DSB/Home.html
What has NOT been covered (1)
• Deep learning / Neural Networks
  • Covered in other presentations at DKOM
  • Also recommended for further reading (deep dive):
    • http://neuralnetworksanddeeplearning.com/index.html
• Specifics of ML algorithms
  • All over the internet… e.g. at http://machinelearningmastery.com/
What has NOT been covered (2)
• Libraries (examples):
  • TensorFlow, Caffe, Theano, Keras
  • SciPy & scikit-learn
  • Spark MLlib (Scala/Java/Python)
• Programming languages:
What has NOT been covered (3)
• SAP products:
  • SAP HANA, SAP HANA Vora, SAP BO Predictive Analytics(!), HCP Predictive Services
  • New machine learning platform
• Hardware
  • Nvidia talk about GPUs
What has NOT been covered (4)
• Ethics and algorithmic transparency:
What has NOT been covered (5)
• The Data Science & Data Mining Process:
What has NOT been covered (6)
• How to integrate ML into your business application
  • I hope SAP is figuring that out as we speak ;-)
  • Have a look at SAP Predictive Analytics Integrator
    • https://help.sap.com/pai
Take-aways
• Goal of ML: generalize from training data (not optimization!!)
• No magic! Just some clever algorithms…
• Increasingly important non-technical aspects:
  • Ethics
  • Algorithmic transparency
Thank You
www.soapeople.com
info@soapeople.com
@SOAPEOPLE
Fred Verheul
Big Data Consultant
+31 6 3919 2986
fred.verheul@soapeople.com

Machine learning 101 dkom 2017


Editor's Notes

  • #4 This diagram is attributed to Pedro Domingos who used it in his Coursera Machine Learning course in 2012.
  • #5 Source: http://timoelliott.com/blog/2007/11/thanksgiving_predictive_analyt.html
  • #9 Sources: Regression - http://gerardnico.com/wiki/data_mining/linear_regression Classification - ?? Clustering - https://en.wikipedia.org/wiki/Cluster_analysis Dimensionality reduction: http://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization
  • #10 Source: http://machinelearningmastery.com/
  • #11 Sources: Decision Tree - https://en.wikipedia.org/wiki/Decision_tree_learning Instance-based - https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm Neural Networks - https://en.wikipedia.org/wiki/Artificial_neural_network Ensembles - https://www.analyticsvidhya.com/blog/2015/09/questions-ensemble-modeling/
  • #13 Sources: Greedy Search - https://en.wikipedia.org/wiki/Greedy_algorithm Gradient Descent - ?? Linear Programming - http://courses.wccnet.edu/~palay/math181/linearprogramming.htm
  • #14 Source: https://onlinecourses.science.psu.edu/stat857/node/160