Practical ML
Antonio Pitasi
Samuele Sabella
2019 - Polo Fibonacci
Before we get started, we need some theory
Machine learning
● Practice: We define machine learning as a set of methods that can
automatically detect patterns in data, and then use the uncovered patterns to
predict future data, or to perform other kinds of decision making under
uncertainty
● Theory: How does learning performance vary with the number of training
examples presented? Which learning algorithms are most appropriate for
various types of learning tasks?
- Machine Learning, Tom Mitchell
- Machine Learning: A Probabilistic Perspective, Kevin P.
Murphy
ML is not only artificial neural networks
● Lots of mathematical models
○ Hidden Markov models
○ Support Vector Machines
○ Decision trees
○ Boltzmann machines, Deep belief networks, Deep Boltzmann machines
● Neural network models are many...
○ Shallow network, Deep neural network
○ CNN (Yolo, AlexNet, GoogLeNet)
○ Echo state network, Deep echo state network
○ RNN, LSTM, GRU
Machine learning categories
● Supervised: the goal is to learn a mapping from input to output given a
dataset of labeled pairs called the training set (e.g. the Iris Data Set [2])
● Unsupervised: we have only a set of data points and the
goal is to find interesting patterns in the data
Example: young American males who buy diapers also have a predisposition
to buy beer (original story [3])
How does it work?
● Dataset of examples to learn from
● A model to learn from that data (e.g. neural net)
○ With some parameters to tune
○ With some hyperparameters to choose (neurons, layers, ...)
● Target function (loss) to minimize
[Slide figure: the network outputs cat: 0.9, lobster: 0.1; for a bad
prediction like [0.3, 0.2, 0.1] the loss tells the model "you are wrong!"]
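The three ingredients above (a dataset, a model with parameters and hyperparameters, a loss to minimize) can be sketched end to end in a few lines. This is an illustrative toy, not the slide's code: a single linear neuron fit to y = 2x + 1 by plain gradient descent, with the learning rate and epoch count as the hyperparameters we choose.

```python
import numpy as np

# Toy dataset: noiseless points from y = 2x + 1 (hypothetical choice).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 1))
y = 2 * X + 1

w, b = 0.0, 0.0          # parameters the optimizer tunes
lr, epochs = 0.5, 200    # hyperparameters we choose

for _ in range(epochs):
    pred = X * w + b
    err = pred - y
    loss = np.mean(err ** 2)          # target function (loss) to minimize
    w -= lr * np.mean(2 * err * X)    # gradient step on w
    b -= lr * np.mean(2 * err)        # gradient step on b

# After training, w is close to 2 and b close to 1.
```

The same loop, with many more parameters and a smarter optimizer such as Adam, is what training a neural network amounts to.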
What is usually done
● Validation phase: compare different models and configurations
○ Which model to choose
○ Model hyper-parameters
● Test phase: loss, accuracy, recall, precision…
● We skip all this for the sake of simplicity
Note: train/validation/test on different data
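The note above can be made concrete with a minimal numpy sketch of a three-way split; library helpers (e.g. scikit-learn's train_test_split) do the same job, and the 60/20/20 ratio here is an arbitrary choice for illustration.

```python
import numpy as np

# Fake data standing in for a real dataset (hypothetical shapes).
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)

idx = rng.permutation(len(X))              # shuffle once, then slice
train, val, test = idx[:60], idx[60:80], idx[80:]

X_train, y_train = X[train], y[train]      # fit parameters here
X_val,   y_val   = X[val],   y[val]        # pick model/hyperparameters here
X_test,  y_test  = X[test],  y[test]       # report final metrics here
```

The point is that the three index sets are disjoint: a model is never evaluated on data it was tuned on.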
Models: feed-forward neural networks
virginica if argmax(net(input))==0 else setosa
non-linear
function
Note: A non-linear function ensures the ANN is a universal approximator [4]
Stack neurons in layers
Models: feed-forward neural networks
non-linear
function
*The optimizer will tune these
weights to minimize the loss
function in batches of examples
(btw we will use Adam [5])
Stack neurons in layers
virginica if argmax(net(input))==0 else setosa
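The forward pass sketched on these two slides (stacked layers, a non-linearity in between, argmax on the output) fits in a few lines of numpy. The layer sizes and the random weights here are placeholders; in practice the optimizer tunes them.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Hypothetical shapes: 4 input features, 8 hidden units, 2 output classes.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # input -> hidden
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)   # hidden -> output

def net(x):
    h = relu(x @ W1 + b1)      # hidden layer with non-linear function
    return h @ W2 + b2         # one score per class

x = np.array([5.1, 3.5, 1.4, 0.2])   # one Iris-like input (made up)
label = "virginica" if np.argmax(net(x)) == 0 else "setosa"
```

With untrained random weights the label is meaningless; the structure is the point.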
A lot more stuff to know but for us...
UNDERSTANDING
MACHINE LEARNING
import keras
Practical ML
- Interactive
- Collaborative
- Python, R, Julia, Scala, ...
Easy to build a neural network;
easy to build it wrong.
Keep an eye on: accuracy, fitting, performance
Problem 1
Points classification
Generating the dataset
Plotting
Our model
● For non-linearity: rectified linear
unit (ReLU)
● We use a softmax function in the
output layer to represent a
probability distribution
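The two ingredients named above take one line each in numpy; the softmax below uses the standard max-subtraction trick for numerical stability.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: zero for negative inputs, identity otherwise.
    return np.maximum(0, x)

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # raw output-layer scores (made up)
probs = softmax(scores)
# probs is a valid probability distribution: non-negative, sums to 1,
# and preserves the ranking of the scores.
```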
Let’s code!
https://colab.research.google.com
https://ml.anto.pt
Me after
training a neural
network
Back to theory - Convolving Lenna
● Given a function f, a convolution g with a kernel w is given by a very complex
formula with a very simple meaning: “adding each element of the image to its
local neighbors, weighted by the kernel” (wikipedia)
Note: check the manual implementation at http://bit.ly/ml-dummies-lenna References: [11]
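The "weighted local neighbors" idea is short to write down: slide a kernel over the image and take a dot product at every position (no padding, stride 1, and, as deep-learning libraries do, without flipping the kernel). The Sobel kernel used here, mentioned in the speaker notes, responds to vertical edges.

```python
import numpy as np

def conv2d(img, kernel):
    # Valid convolution: output shrinks by kernel_size - 1 per axis.
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output pixel = neighborhood weighted by the kernel.
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
img = np.zeros((6, 6))
img[:, 3:] = 1.0                 # a vertical step edge (toy image)
edges = conv2d(img, sobel_x)     # strong response along the edge only
```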
Convolution arithmetic
● Zero-padding: deal with border pixels by adding zeros (preserves the size)
● Pooling: helps the network to become transformation invariant (translations,
rotations...) [7]
padding=same && no strides
GIF credits - https://github.com/vdumoulin/conv_arithmetic
max pooling && 2x2 strides
No padding, no strides
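Both operations above have one-line numpy equivalents: "same" zero-padding for a 3x3 kernel pads by one pixel so the output keeps the input size, and 2x2 max pooling with stride 2 halves each spatial dimension. The 4x4 input below is a made-up example.

```python
import numpy as np

x = np.arange(16, dtype=float).reshape(4, 4)

# Zero-padding: surround the image with a one-pixel border of zeros.
padded = np.pad(x, 1)                              # shape (6, 6)

# 2x2 max pooling, 2x2 strides: split into 2x2 blocks, keep each maximum.
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))    # shape (2, 2)
```

Keras exposes the same two ideas as the `padding="same"` argument of Conv2D and the MaxPooling2D layer.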
Dealing with multiple input channels
References: [12]
GoogLeNet on ImageNet - Feature visualization
feature visualization of
the 1st conv. layer
layer 3a layer 4d
References: [8,9,10]
Stack multiple filters and
learn kernels dynamically
(hierarchy of features)
Problem 2
Digits recognition
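Before MNIST digits [6] reach a dense network they need the usual preprocessing: scale pixel values to [0, 1], flatten each 28x28 image to a 784-vector, and one-hot encode the labels. A sketch with random pixels standing in for the real dataset:

```python
import numpy as np

# Fake digit images and labels (placeholders for the real MNIST data).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28))
labels = np.array([3, 1, 4, 1, 5])

X = images.reshape(len(images), -1) / 255.0   # (5, 784), values in [0, 1]
Y = np.eye(10)[labels]                        # one-hot, shape (5, 10)
```

With this shape the data feeds straight into a softmax-output network like the one built for Problem 1, just with 784 inputs and 10 classes.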
References
[1] Pattern Recognition in a Bucket
https://link.springer.com/chapter/10.1007/978-3-540-39432-7_63
[2] Iris dataset: https://archive.ics.uci.edu/ml/datasets/iris
[3] Beer and diapers: http://www.dssresources.com/newsletters/66.php
[4] Multilayer feedforward networks are universal approximators:
http://cognitivemedium.com/magic_paper/assets/Hornik.pdf
[5] Adam: A Method for Stochastic Optimization: https://arxiv.org/abs/1412.6980
[6] MNIST dataset: http://yann.lecun.com/exdb/mnist/
References
[7] Bengio, Yoshua, Ian Goodfellow, and Aaron Courville. Deep learning. Vol. 1.
MIT press, 2017: http://www.deeplearningbook.org/
[8] Feature-visualization: https://distill.pub/2017/feature-visualization/
[9] Going deeper with convolutions: https://arxiv.org/pdf/1409.4842.pdf
[10] ImageNet: A large-scale hierarchical image database:
http://www.image-net.org/papers/imagenet_cvpr09.pdf
[11] Culture, Communication, and an Information Age Madonna:
http://www.lenna.org/pcs_mirror/may_june01.pdf
References
[12] Intuitively Understanding Convolutions for Deep Learning:
https://towardsdatascience.com/intuitively-understanding-convolutions-for-deep-learning-1f6f42faee1
[Slide: recommended books, ordered by difficulty]
Antonio Pitasi
Software Engineer, Nextworks
https://anto.pt
Samuele Sabella
https://github.com/samuelesabella
Editor's Notes

  1. The explanation goes that when fathers are sent out on an errand to buy diapers, they often purchase a six-pack of their favorite beer as a reward.
  2. It is important that g be non-linear, otherwise the whole model collapses into a large linear regression model of the form y = wT(Vx). One can show that an MLP is a universal approximator, meaning it can model any suitably smooth function, given enough hidden units, to any desired level of accuracy (Hornik 1991).
  3. (Same note as 2.)
  4. First slide: Pitasi
  5. Jupyter! Open-source notebooks that combine code and its output
  6. Not just code and its output: images, markdown, LaTeX, links
  7. Strengths of notebooks: sharing and collaboration. Jupyter also supports languages other than Python: currently around 40 languages
  8. For the workshop we will use Google's online notebooks: Colab(orative)
  9. Recap: we will use Google Colab notebooks, plus a few libraries. NUMPY: efficient mathematical operations (e.g. matrices, vectors, ...). SCIPY implements further mathematical algorithms and utilities. PANDAS implements data structures such as tables and time series. MATPLOTLIB is the fundamental plotting library. TENSORFLOW is Google's machine learning library. KERAS is a high-level library on top of TENSORFLOW that provides fairly simple functions for machine learning
  10. Our problem: distinguish the points clustered at the centre from the band surrounding them. We start by generating some of these points to use as a training set, then generate more as a test set, and finally see how our network behaves
  11. (Same note as 10.)
  12. (Same note as 2.)
  13. The Sobel operator (the one in the picture) returns local intensity changes along one axis. In other words, the kernel activates (maximum dot product) when placed over a patch of the image with high changes along one axis (a border). A specific convolution kernel can be used to detect features in an image, for example borders.