Machine Learning
Big picture
Francis Pieraut – Oct 2016
My startup bias about efficiency
Plan
1. My path in ML
2. AI big picture (expert systems to ML)
3. ML trends over time 1980-2008+
4. Type of ML (supervised vs unsupervised)
5. Relationship Data-mining vs ML
6. Training process
7. regularization technique
8. ML research big picture
9. What is this Deeplearning Revolution?
10. ML in practice -> feature engineering
11. Importance of the cost function
12. Data importance -> NIPS 2009
13. The tagging nightmare
14. ML & Optimization
15. Adversarial Examples
Francis Evolution in ML
• 1999 – Decision Tree expert (Samy Bengio)
• 2001-2003 – Research with Bengio (huge networks) -> flayers
• 2003 – Idilia -> Importance of good tagged dataset and features &
overfitting
• 2005-2006 – Dakis -> KISS (ML not required & importance of
comprehensive knowledge) – Expert System
• 2006-2009 – Data-Mining (Understand first & features extraction)…MLboost
• 2010-2013 – QMining -> big-data mining
• 2003-2016 – Nuance -> Data Maturity & Data-driven design
Data Maturity model reminder
AI big picture
Type of ML
Parametric Non-Parametric
Reinforcement
ML trends over time 1980-2008+
http://fraka6.blogspot.com/2013/10/deep-learning-history-and-most.html
10 main ML algo
• Naïve Bayes Classifier Algorithm
• K Means Clustering Algorithm
• Support Vector Machine Algorithm
• Apriori Algorithm
• Linear Regression
• Logistic Regression
• Artificial Neural Networks (gradient)
• Random Forests
• Decision Trees (info theory)
• K Nearest Neighbors
*** Machine learning: dangerous hype ***
Traps and Pitfalls
Data-Mining vs Machine Learning
Training Process
Classification error over time
Training
Regularization technique
• Regularization is a technique used in an attempt to solve
the overfitting [1] problem in statistical models.*
• Examples:
– Early stopping
– Decrease constant
– Dropout
– Mini-batch
– Better cost function (e.g. margin vs MSE)
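Early stopping, the first item above, can be sketched in a few lines. This is a minimal illustration and not the author's setup: the toy data, the `patience` parameter and the function names are all assumptions made for the example. The idea is to keep the parameters that gave the lowest validation loss and stop once that loss has not improved for a while.

```python
import random

def mse(w, b, data):
    """Mean squared error of the line y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train_with_early_stopping(train, valid, lr=0.05, max_epochs=500, patience=10):
    """Gradient descent on the training MSE, monitored on a validation set.

    Returns (best validation loss, w, b, epoch at which it was reached).
    """
    w, b = 0.0, 0.0
    best = (mse(w, b, valid), w, b, 0)
    for epoch in range(1, max_epochs + 1):
        # Full-batch gradient of the training MSE.
        gw = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        gb = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w, b = w - lr * gw, b - lr * gb
        v = mse(w, b, valid)
        if v < best[0]:
            best = (v, w, b, epoch)      # new best: remember these parameters
        elif epoch - best[3] >= patience:
            break                        # no improvement for `patience` epochs
    return best

# Hypothetical toy data: a noisy line, split into training and validation sets.
rng = random.Random(0)
points = []
for _ in range(60):
    x = rng.uniform(-1, 1)
    points.append((x, 2.0 * x + 1.0 + rng.gauss(0, 0.3)))
train, valid = points[:40], points[40:]

best_loss, w, b, best_epoch = train_with_early_stopping(train, valid)
print(best_loss, w, b, best_epoch)
```

The validation set plays the role of unseen data: training error can keep falling after validation error has started to rise, which is the overfitting the slide refers to.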
What is tough about ML
• More parameters = more examples are
required
• Tagged data is hard to create compared to
untagged data
• There is no magic -> Feature engineering
• Better features -> fewer examples -> less
capacity problem
• Getting good example sampling (don’t
introduce bias)
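A small sketch of the "better features" point, under assumptions of my own: the XOR-style toy data, the perceptron learner and the product feature are illustrative choices and are not taken from the slides. A linear model cannot separate the raw inputs, but one hand-crafted feature makes the same four examples separable.

```python
def train_perceptron(samples, epochs=50):
    """Classic perceptron; samples are (features, label) with label in {-1, +1}."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in samples:
            # Update only on mistakes (or zero-margin points).
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

def accuracy(model, samples):
    """Fraction of samples classified with a strictly positive margin."""
    w, b = model
    hits = sum(1 for x, y in samples
               if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) > 0)
    return hits / len(samples)

# XOR-style labels: +1 when both inputs have the same sign.
raw = [([1, 1], 1), ([-1, -1], 1), ([1, -1], -1), ([-1, 1], -1)]
# Hand-crafted feature: the product x1*x2, appended to the raw inputs.
engineered = [(x + [x[0] * x[1]], y) for x, y in raw]

acc_raw = accuracy(train_perceptron(raw), raw)
acc_eng = accuracy(train_perceptron(engineered), engineered)
print(acc_raw, acc_eng)
```

The learner is identical in both runs; only the input representation changes, which is the sense in which a better feature reduces the capacity the model needs.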
Example feature engineering
Example feature engineering
What is this Deeplearning Revolution?
• Deep architectures are more powerful than shallow
architectures
• Before 2006 we couldn’t train deep architectures
• Revolution
– Convolutional NN
– Train generative models (Auto-encoder) -> learn the
data constraints…..unsupervised learning… (better
parameters initialization)
– STD Training
Example of deep learning in images
ML in practice
• Black box = recipe for a disaster
• 90% feature engineering
• ML = automatic tuning
• Garbage in = Garbage out
• Tagging is a pain….manual work
Importance of the cost function
• Neural network cost functions (back prop)
– MSE & Log soft max
– Example NETFLIX & recommendation
• Optimization
– SVM = Maximize Margin
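The three cost functions named above can be written out directly. This is a plain-Python sketch for comparison only; the function names and the example scores are assumptions, and the hinge loss is used here as the standard margin loss associated with SVMs.

```python
import math

def mse_loss(scores, target_index):
    """Squared error against a one-hot target, averaged over classes."""
    return sum((s - (1.0 if i == target_index else 0.0)) ** 2
               for i, s in enumerate(scores)) / len(scores)

def log_softmax_loss(scores, target_index):
    """Negative log-probability of the target class under a softmax."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[target_index]

def hinge_loss(score, label, margin=1.0):
    """Binary margin loss, label in {-1, +1}: zero once label*score >= margin."""
    return max(0.0, margin - label * score)

scores = [2.0, 0.5, -1.0]  # hypothetical network outputs for 3 classes
print(mse_loss(scores, 0), log_softmax_loss(scores, 0))
print(hinge_loss(2.0, 1), hinge_loss(0.3, 1), hinge_loss(-0.5, 1))
```

Note the difference in behaviour: the hinge loss is exactly zero for any example classified with enough margin, while MSE keeps penalizing scores that overshoot the target.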
Data Importance NIPS 2009
• Google -> that is enough
– Parameter optimization; tweaking kernels (SVM)
– More parameters than # examples
– Simpler model + more data = what works
The tagging nightmare
• You still need tagged data
• Tagging is hard to automate
• Tagged data is error prone (garbage in,
garbage out)
– Idilia use case
– Nuance use case
The lie about ML
• Machine learning != Optimization
• Machine learning != Statistics
• Machine learning = Optimization problem
with constraints to generalize
(regularization)
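One common way to write "optimization with constraints to generalize" is as a penalized training objective. The notation below is an assumption, since the slides give no formula: $f_\theta$ is the model, $L$ the cost function, $\Omega$ the regularizer and $\lambda \ge 0$ its weight.

```latex
\hat{\theta} \;=\; \arg\min_{\theta}\;
  \frac{1}{n}\sum_{i=1}^{n} L\bigl(f_\theta(x_i),\, y_i\bigr)
  \;+\; \lambda\,\Omega(\theta),
\qquad \text{e.g. } \Omega(\theta) = \lVert \theta \rVert_2^2 .
```

With $\lambda = 0$ this is pure optimization of the training error; the penalty term is what pushes the solution toward models that generalize.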
Adversarial example - Ian Goodfellow
(now at OpenAI)
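The slide does not say which attack it showed. As an illustration, here is a sketch of the fast gradient sign method described by Goodfellow and co-authors, applied to a tiny logistic model. The weights, the input and `eps` are hand-picked, hypothetical values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(g):
    """Return -1, 0 or +1 according to the sign of g."""
    return (g > 0) - (g < 0)

def fgsm(w, b, x, y, eps):
    """Shift each input by eps in the direction that increases the loss."""
    p = predict(w, b, x)
    # For the cross-entropy loss with label y in {0, 1}: d(loss)/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

w, b = [2.0, -3.0, 1.0], 0.0   # hypothetical, hand-picked weights
x, y = [0.2, -0.1, 0.1], 1     # clean input with true label 1
x_adv = fgsm(w, b, x, y, eps=0.2)

print(predict(w, b, x), predict(w, b, x_adv))
```

No coordinate moves by more than `eps`, yet the predicted probability of the true class drops from above 0.5 to below it, so the predicted label flips.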
Conclusion
• ML is quite a mature field
• ML != Deeplearning
– Deeplearning = major breakthrough, hype
phase, not mature
• NN = optimization problem with constraints
• SP operates more like expert systems
• Algo is as good as its inputs -> feature
engineering
QUESTIONS
francis@qmining.com