What is Machine Learning?
14-10-2019
Contents
• Attempt at definition
• Applications
• Techniques
• Considerations
• Sources
• Software
• ML in R and Python
Definition
Applications
Techniques
Considerations
Sources
Software
ML in R & Python
Attempt at definition
Machine Learning
• “… gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959)
• “… is the systematic study of algorithms and systems that improve their knowledge or performance with experience” (Peter Flach, 2012)
• “… concerns systems that automatically learn programs from data” (Pedro Domingos, 2012)
Related to ML
• Artificial Intelligence
• Knowledge discovery
• (Predictive) Analytics
• Statistics / Statistical Learning
• Optimization
• Evolutionary algorithms
• Deep Learning
• Data Mining
• Pattern recognition
• Data Science
• Informatics, computer/computational science
• Econometrics
• Related buzzwords: Big Data, Internet of Things
Related fields
Artificial Intelligence
Data Science
Statistics
Informatics
Econometrics
Optimization
Terminology
Statistics/Econometrics            | Machine Learning
Independent variables, predictors  | Features, inputs
Dependent variable                 | Output, response
Estimation, fitting                | Training, learning
Dummy coding                       | One-hot encoding
Transformation of variables        | Feature engineering
Parameters                         | Weights
Regression/classification          | Supervised learning
Goal is to understand (model)      | Goal is to predict
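The terminology table pairs dummy coding with one-hot encoding. A minimal sketch of both follows, assuming the common convention that one-hot keeps one 0/1 column per category while dummy coding drops one reference category (which avoids collinearity with an intercept). The function names and the colour data are made up for illustration.

```python
def one_hot(values):
    """Return (categories, rows) with one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def dummy_code(values):
    """Like one_hot, but drop the first category as the reference level."""
    categories, rows = one_hot(values)
    return categories[1:], [row[1:] for row in rows]


# Made-up example data (not from the slides)
colours = ["red", "green", "blue", "green"]
cats, encoded = one_hot(colours)
ref_cats, dummies = dummy_code(colours)

print(cats, encoded)
print(ref_cats, dummies)
```

With k categories, one-hot yields k columns and dummy coding k-1; the dropped category is represented by a row of zeros.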
ML Applications
• Handwriting recognition
• Facial/image recognition
• Speech recognition
• Spam filters
• Text Mining
• DNA sequence classification
• Search engines
• Stock market analysis
• Game playing
• Medical diagnostics
• Fraud detection
• Passenger screening
• Crime prediction
• Satellite image classification
• Robotics
• Automatic flight pilots
• Self-driving cars
• …
Recommendations
“Object” recognition
in computer vision
ML Techniques
Three main groups:
• Supervised learning (classification, regression)
• Unsupervised learning (PCA, clustering, …)
• Reinforcement learning (agent-based)
• (Transfer learning)
• …
What do you need for a self-driving car?
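To make the unsupervised group above concrete, here is a minimal sketch of clustering without labels: k-means (Lloyd's algorithm) on one-dimensional data. The data, the starting centres, and the function name are assumptions chosen for illustration.

```python
def kmeans_1d(points, centres, iterations=10):
    """Assign each point to its nearest centre, then move centres to cluster means."""
    centres = list(centres)
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # Keep the old centre if a cluster ends up empty
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres, clusters


# Made-up data with two visible groups; no labels are given to the algorithm
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centres, clusters = kmeans_1d(data, centres=[0.0, 6.0])
print(centres, clusters)
```

The algorithm only sees the inputs; the two groups emerge from the distances alone, which is what separates it from the supervised case.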
Supervised techniques
Specific for Classification
• Decision trees
• Bagged trees
• Boosted trees
• Random Forests
• Neural networks
• Support Vector Machines
• Genetic programming
• Bayesian Networks
• MARS
• Lasso
• Logistic regression
• Naive Bayes
• kNN
• Ensemble models
• …
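As an illustration of one technique from the list above, the following is a minimal sketch of a kNN classifier: a query point gets the majority label of its k nearest training points. The training data, the Euclidean distance, and k=3 are assumptions for the example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up training set with two well-separated classes
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (6.5, 7.0), (7.0, 6.0)]
train_y = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(train_X, train_y, (1.2, 1.4))
prediction_near_b = knn_predict(train_X, train_y, (6.4, 6.2))
print(prediction_near_a, prediction_near_b)
```

There is no explicit training step here: kNN stores the labelled examples and does all of its work at prediction time.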
Techniques
Example Unsupervised: Anomaly detection
[Figure labels: boundary case, outlier, extreme case]
Robust regression: MVE estimation
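A minimal sketch of robust anomaly detection follows. It does not implement MVE estimation; it swaps in a simpler univariate rule based on the median and the median absolute deviation (MAD), which shares the idea that the outliers themselves should not distort the estimate of "normal". The readings, the 3.5 threshold, and the function name are assumptions.

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Flag values whose robust z-score (median/MAD based) exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread around the median; the rule cannot score anything
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the data are roughly normal
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Made-up sensor readings with one obvious anomaly
readings = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 25.0]
flagged = mad_outliers(readings)
print(flagged)
```

A mean and standard deviation would be pulled towards the value 25.0, whereas the median and MAD barely move, so the anomaly stands out clearly.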
Optimisation techniques
• Linear programming
• (Mixed) Integer Programming
• Non-linear programming
• Modern optimisation techniques:
Considerations
• ML and statistical methods are often very similar; both combine representation, evaluation, and optimisation
• Personal note: if it predicts well, use it! (but explainability may also matter)
• Data preparation (incl. feature engineering) is often 80% of the work
• The bias-variance dilemma remains (risk of overfitting)
• Make fair comparisons using ROC curves on an independent test set
• “No free lunch”: no single technique is always best
=> Use expert knowledge and choose a representation that fits the problem (data alone is not enough)
• Curse of dimensionality: the input space grows exponentially with the number of features k; the number of observations (generally) does not
• Consider building multiple models and combining them (ensembles)
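The advice above to compare models with ROC curves on an independent test set can be sketched as follows. The area under the ROC curve (AUC) equals the fraction of positive/negative pairs in which the positive example gets the higher score, so it can be computed without plotting. The labels and the two models' scores are made up; in practice they would come from data the models never saw during training.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out test labels and scores of two made-up models
test_labels = [1, 0, 1, 1, 0, 0]
model_a = [0.9, 0.2, 0.8, 0.6, 0.4, 0.1]
model_b = [0.7, 0.6, 0.4, 0.8, 0.5, 0.3]

auc_a = roc_auc(test_labels, model_a)
auc_b = roc_auc(test_labels, model_b)
print(auc_a, auc_b)
```

Both models are scored on the same test labels, which is what makes the comparison fair; an AUC of 0.5 corresponds to random ranking and 1.0 to a perfect one.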
Sources
• Literature (use Google Scholar and arXiv)
• Data (Kaggle, UCI, Quandl, governments, APIs)
• Competitions (Kaggle, Topcoder, HackerRank, CrowdAnalytix)
• Courses (Coursera, Udacity, Udemy, DataCamp)
• Academic education (A’dam School of Data Science, Eindhoven, Delft, Tilburg)
• Fora (Kaggle, Stack Overflow, Quora)
• Other websites (Analytics Vidhya, Data Science Central, DeepMind, DutchDigitalDelta-Commit2data)
• …s (e.g. ADS and AMDS)
Literature
• Articles
– A Few Useful Things to Know About Machine Learning (Pedro Domingos, CACM Oct 2012)
– Statistical Modeling: The Two Cultures (Leo Breiman, Statistical Science 2001)
• Books
– The Elements of Statistical Learning (Hastie/Tibshirani/Friedman; Springer 2008)
– Applied Predictive Modeling (Kuhn/Johnson; Springer 2013)
– Machine Learning (Flach; Cambridge Univ. Press 2012)
– Reinforcement Learning: An Introduction (Sutton/Barto; MIT Press 2012)
– Artificial Intelligence: A Modern Approach 3rd ed. (Russell/Norvig; Prentice Hall 2016)
– Modern Optimization with R (Cortez; Springer 2014)
Software
• General purpose programming languages
– Python
– R
– SAS
– Matlab
• ML environments/libraries
– MS Azure
– Google TensorFlow (for R)
– AWS: Amazon Web Services
Machine Learning in R
• CRAN Task View: Machine Learning & Statistical Learning
• Caret package
– Vignette
– Many model types
– Training and prediction
– Variable importance
– Parameter tuning
– Cross-Validation, ROC curves, plots
– etc.
• TensorFlow interface (via Python)
Machine Learning in Python
• scikit-learn
• PyTorch (for deep learning)
• (auto-ml)
Reinforcement Learning
• MDP: Markov Decision Process
• Environment (states S, actions A, transition probabilities P, rewards R) entirely or partly known
• Packages in R
– MDPtoolbox
– ReinforcementLearning
• Code in Python
– Lots on GitHub, e.g. DeepMind TRFL
• Writing your own code
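When the environment (S, A, P, R) is entirely known, an MDP can be solved by dynamic programming rather than by trial and error. The following is a minimal sketch of value iteration on a tiny MDP. The two states, two actions, transition probabilities, rewards, and the discount factor 0.9 are all invented for illustration, and R is assumed to be the expected immediate reward for taking an action in a state.

```python
def q_value(s, a, V, P, R, gamma):
    """Expected return of taking action a in state s, then following values V."""
    return sum(p * (R[s][a] + gamma * V[s2]) for s2, p in P[s][a].items())


def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """Return optimal state values and a greedy policy for a known MDP."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, V, P, R, gamma) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(s, a, V, P, R, gamma))
        for s in states
    }
    return V, policy


# Hypothetical toy MDP: P[s][a] maps next states to probabilities
states = ["low", "high"]
actions = ["wait", "work"]
P = {
    "low": {"wait": {"low": 1.0}, "work": {"high": 0.8, "low": 0.2}},
    "high": {"wait": {"high": 1.0}, "work": {"high": 1.0}},
}
R = {
    "low": {"wait": 0.0, "work": -1.0},
    "high": {"wait": 2.0, "work": 1.0},
}

V, policy = value_iteration(states, actions, P, R)
print(V, policy)
```

When P and R are only partly known, this computation is no longer possible directly, and the agent has to estimate values from experience instead, which is where reinforcement learning methods come in.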
Hype or Hope?
Backup Slides
Hype or Hope?


Oct2019 - What is machine learning?

  • 2. Contents • Attempt at definition • Applications • Techniques • Considerations • Sources • Software • ML in R and Python 14-10-2019 page 2
  • 4. Attempt at definition Machine Learning • “… gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959) • “… is the systematic study of algorithms and systems that improve their knowledge or performance with experience” (Peter Flach, 2012) • “… concerns systems that automatically learn programs from data” (Pedro Domingos, 2012) 14-10-2019 page 4 Definition Applications Techniques Considerations Sources Software ML in R & Python
  • 5. Related to ML • Artificial Intelligence • Knowledge discovery • (Predictive) Analytics • Statistics / Statistical Learning • Optimization • Evolutionary algorithms • Deep Learning • Data Mining • Pattern recognition • Data Science • Informatics, computer/computational science • Econometrics • Related buzzwords: Big Data, Internet of Things 14-10-2019 page 5 Definition Applications Techniques Considerations Sources Software ML in R & Python
  • 6. Related fields
      • Artificial Intelligence
      • Data Science
      • Statistics
      • Informatics
      • Econometrics
      • Optimization
  • 7. Terminology
      Statistics/Econometrics              Machine Learning
      Independent variables, predictors    Features, inputs
      Dependent variable                   Output, response
      Estimation, fitting                  Training, learning
      Dummy coding                         One-hot encoding
      Transformation of variables          Feature engineering
      Parameters                           Weights
      Regression/classification            Supervised learning
      Goal is to understand (model)        Goal is to predict
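
The "Dummy coding / One-hot encoding" row of the terminology table can be illustrated with a minimal sketch. The function names `one_hot` and `dummy_code` and the colour data are hypothetical, chosen only for illustration. The sketch also assumes the common statistical convention that dummy coding drops one reference category, which the slide does not spell out.

```python
def one_hot(values):
    """One-hot encode categorical values.

    Returns the sorted category names and one 0/1 row per input value.
    """
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # exactly one column is set per observation
        rows.append(row)
    return categories, rows


def dummy_code(values):
    """Dummy coding: one-hot encoding with the first category dropped
    as the reference level (assumed convention, see lead-in)."""
    categories, rows = one_hot(values)
    return categories[1:], [row[1:] for row in rows]


# Hypothetical example data
colours = ["red", "green", "blue", "green"]
cats, encoded = one_hot(colours)
ref_cats, dummies = dummy_code(colours)
```

With k categories, one-hot encoding yields k columns and dummy coding k-1; the dropped column avoids perfect collinearity with an intercept in a regression model.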
  • 8. ML Applications
      • Handwriting recognition
      • Facial/image recognition
      • Speech recognition
      • Spam filters
      • Text Mining
      • DNA sequence classification
      • Search engines
      • Stock market analysis
      • Game playing
      • Medical diagnostics
      • Fraud detection
      • Passenger screening
      • Crime prediction
      • Satellite image classification
      • Robotics
      • Automatic flight pilots
      • Self-driving cars
      • …
  • 11. “Object” recognition in computer vision
  • 12. ML Techniques
      Three groups:
      • Supervised learning (classification, regression)
      • Unsupervised learning (PCA, clustering, …)
      • Reinforcement learning (agent-based)
      • (Transfer learning)
      • …
      What do you need for a self-driving car?
  • 13. Supervised techniques
      Specific for Classification
      • Decision trees
      • Bagged trees
      • Boosted trees
      • Random Forests
      • Neural networks
      • Support Vector Machines
      • Genetic programming
      • Bayesian Networks
      • MARS
      • Lasso
      • Logistic regression
      • Naive Bayes
      • kNN
      • Ensemble models
      • …
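
As an illustration of one listed supervised technique, kNN, here is a minimal sketch of a k-nearest-neighbours classifier. The function name `knn_predict` and the toy training points are hypothetical. It assumes Euclidean distance and a simple majority vote, which is only one of several possible choices.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]

pred_near_a = knn_predict(train_X, train_y, (1.1, 1.0))
pred_near_b = knn_predict(train_X, train_y, (5.0, 5.1))
```

There is no separate training step: the "model" is the stored training data, and all the work happens at prediction time.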
  • 16. Example Unsupervised: Anomaly detection
      (figure labels: boundary case, outlier, extreme case)
      Robust regression: MVE estimation
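
The slide uses MVE estimation for multivariate data. The sketch below is not MVE. It swaps in a much simpler one-dimensional analogue, flagging outliers with the median and the median absolute deviation (MAD), to show the general idea of robust anomaly detection. The function name `robust_outliers`, the threshold of 3.5 and the data are assumptions made for illustration.

```python
from statistics import median


def robust_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`.

    The modified z-score uses the median and the MAD instead of the mean
    and standard deviation, so a few extreme values cannot mask themselves.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread around the median: nothing can be scored
    # 0.6745 rescales the MAD to be comparable to a standard deviation
    # under a normal distribution
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical measurements with one obvious anomaly
readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 25.0]
flagged = robust_outliers(readings)
```

A mean-and-standard-deviation rule would be pulled towards the value 25.0 itself; the median-based version is not, which is the same motivation behind robust estimators such as MVE.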
  • 17. Optimisation techniques
      • Linear programming
      • (Mixed) Integer Programming
      • Non-linear programming
      • Modern optimisation techniques:
  • 18. Considerations
      • ML and statistical methods are often very similar: representation – evaluation – optimisation
      • Personal note: if it works well (prediction!), use it! (but explainability may also matter)
      • Data preparation (incl. feature engineering) is often 80% of the work
      • The bias-variance dilemma remains (overfitting)
      • Compare methods fairly, using ROC curves on an independent test set
      • “No free lunch”: no single technique is always best => use expert knowledge and choose a representation that fits the problem (data alone is not enough)
      • Curse of dimensionality: the input space grows exponentially with k, the number of features, whereas the number of observations (generally) does not
      • Consider building multiple models and combining them (ensembles)
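
The curse-of-dimensionality point can be made concrete with a small worked calculation. This is a sketch under stated assumptions: each feature is split into 10 bins and the sample size is fixed at 10,000, both hypothetical numbers, and the function names are illustrative.

```python
def n_cells(bins_per_feature, k):
    """Number of grid cells when each of k features is split into equal bins."""
    return bins_per_feature ** k


def avg_obs_per_cell(n_observations, bins_per_feature, k):
    """Average number of observations available per grid cell."""
    return n_observations / n_cells(bins_per_feature, k)


# Hypothetical setting: 10 bins per feature, 10,000 observations
for k in (1, 2, 5, 10):
    print(k, n_cells(10, k), avg_obs_per_cell(10_000, 10, k))
```

With one feature there are 1,000 observations per cell on average; with five features there are ten times more cells than observations, so most of the input space contains no data at all.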
  • 19. Sources
      • Literature (use Google Scholar and arXiv)
      • Data (Kaggle, UCI, Quandl, governments, APIs)
      • Competitions (Kaggle, Topcoder, HackerRank, CrowdAnalytix)
      • Courses (Coursera, Udacity, Udemy, DataCamp)
      • Academic education (A’dam School of Data Science, Eindhoven, Delft, Tilburg)
      • Fora (Kaggle, Stack Overflow, Quora)
      • Other websites (Analytics Vidhya, Data Science Central, DeepMind, DutchDigitalDelta-Commit2data)
      • …s (e.g. ADS and AMDS)
  • 20. Literature
      • Articles
        – A Few Useful Things to Know About Machine Learning (Pedro Domingos, CACM Oct 2012)
        – Statistical Modeling: The Two Cultures (Leo Breiman, Statistical Science 2001)
      • Books
        – The Elements of Statistical Learning (Hastie/Tibshirani/Friedman; Springer 2008)
        – Applied Predictive Modeling (Kuhn/Johnson; Springer 2013)
        – Machine Learning (Flach; Cambridge Univ. Press 2012)
        – Reinforcement Learning: An Introduction (Sutton/Barto; MIT Press 2012)
        – Artificial Intelligence: A Modern Approach 3rd ed. (Russell/Norvig; Prentice Hall 2016)
        – Modern Optimization with R (Cortez; Springer 2014)
  • 21. Software
      • General purpose programming languages
        – Python
        – R
        – SAS
        – Matlab
      • ML environments/libraries
        – MS Azure
        – Google TensorFlow (for R)
        – AWS: Amazon Web Services
  • 22. Machine Learning in R
      • CRAN Task View: Machine Learning & Statistical Learning
      • caret package
        – Vignette
        – Many model types
        – Training and prediction
        – Variable importance
        – Parameter tuning
        – Cross-validation, ROC curves, plots
        – etc.
      • TensorFlow interface (via Python)
  • 23. Machine Learning in Python
      • scikit-learn
      • PyTorch (for deep learning)
      • (auto-ml)
  • 24. Reinforcement Learning
      • MDP: Markov Decision Process
      • Environment (S,A,P,R) entirely or partly known
      • Packages in R
        – MDPtoolbox
        – ReinforcementLearning
      • Code in Python
        – Lots on GitHub, e.g. DeepMind TRFL
      • Self coding
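
For the case where the environment (S,A,P,R) is entirely known, a "self coding" approach might look like the following sketch of value iteration. The two-state MDP, its rewards, the discount factor `gamma` and all names are hypothetical, and the code does not use any of the packages listed on the slide. Transitions are stored as `P[state][action] = [(probability, next_state, reward), ...]`, so P and R are kept together.

```python
def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(states, actions, P, gamma=0.9, tol=1e-8):
    """Compute state values and a greedy policy for a fully known MDP."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(P, V, s, a, gamma) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update
        if delta < tol:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(P, V, s, a, gamma)) for s in states
    }
    return V, policy


# Hypothetical MDP: staying in state "B" pays 1 per step, everything else pays 0
states = ["A", "B"]
actions = ["stay", "move"]
P = {
    "A": {"stay": [(1.0, "A", 0.0)], "move": [(1.0, "B", 0.0)]},
    "B": {"stay": [(1.0, "B", 1.0)], "move": [(1.0, "A", 0.0)]},
}

V, policy = value_iteration(states, actions, P)
```

Here the values converge to about 10 for state B (a reward of 1 per step discounted at 0.9) and about 9 for state A. When P and R are only partly known, methods that learn from sampled experience are needed instead.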