Dr Nisha Arora
Machine Learning Demystified
Image credit - Google Images
Contents
 What is Machine Learning?
 Machine Learning Applications
 Pre-requisites for ML
 Learning ML
 ML Algorithm Categorization
 Supervised, Un-supervised & Semi-supervised Algorithms
 Parametric & Non-Parametric Algorithms
 The ML Framework
 Machine Learning Tools
 R Vs Python Vs SAS
 Reference & Resources
Quotes
 “Machine learning is the next Internet”
(Tony Tether, Director, DARPA)
 “Machine learning is the hot new thing”
(John Hennessy, President, Stanford)
 “Web rankings today are mostly a matter of machine
learning” (Prabhakar Raghavan, Dir. Research, Yahoo)
 “Machine learning is today’s discontinuity”
(Jerry Yang, CEO, Yahoo)
Quotes
Programming Vs Machine Learning
Traditional Programming: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program
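The contrast can be sketched in a few lines of Python. This is only an illustration: the temperature data, the 30-degree rule and the function names are hypothetical and do not come from the slides. In the first half a person writes the program; in the second half the program (a threshold) is derived from data plus known outputs.

```python
# Traditional programming: a human writes the rule (the program).
def is_hot_rule(temp_c):
    return temp_c >= 30  # threshold chosen by the programmer


# Machine learning: the rule is derived from data plus known outputs.
def learn_threshold(temps, labels):
    """Return the threshold that misclassifies the fewest training examples."""
    candidates = sorted(temps)
    best_threshold, best_errors = candidates[0], len(temps) + 1
    for t in candidates:
        errors = sum((x >= t) != y for x, y in zip(temps, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


# Toy data (assumed): temperatures and whether people called the day "hot".
temps = [18, 22, 25, 27, 29, 31, 33, 36]
labels = [False, False, False, False, True, True, True, True]

threshold = learn_threshold(temps, labels)


def is_hot_learned(temp_c):
    return temp_c >= threshold
```

With different data, `learn_threshold` would return a different rule without anyone editing the code, which is the point of the diagram.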
Machine Learning is the
 Study of algorithms that
 Improve their performance
 At some task
 With experience
 Machine learning is the science of getting computers to act
without being explicitly programmed – Andrew Ng
 Machine Learning is the training of a model from data that
generalizes a decision against a performance measure – Jason Brownlee,
on his Machine Learning Mastery blog
What is Machine Learning?
 The field of machine learning is concerned with the question of
how to construct computer programs that automatically improve
with experience - Tom Mitchell in his book Machine Learning
 Vast amounts of data are being generated in many fields, and the
statistician’s job is to make sense of it all: to extract important
patterns and trends, and to understand “what the data says”. We
call this learning from data - Hastie, Tibshirani & Friedman
Other Definitions
Industries using ML
Machine Learning Applications
 Finance
 Credit Scoring
 Fraud Detection
 Retail
 Market Basket Analysis
 Customer Relationship Management (CRM)
 Marketing
 Customer Segmentation
 Customer churn prediction
Machine Learning Applications
 Human Resource management
 Internal mobility (Matching of employees and job descriptions)
 External recruitment (Browsing candidates through social platforms
like LinkedIn)
 Skills management: mapping all company skills (including their
relationships e.g. one programming language being close to another
one)
Machine Learning Applications
 Automating employee access control by Amazon
 To develop a computer algorithm that will predict which employees
should be granted access to what resources.
 To identify whales in the ocean based on audio recordings by Cornell
University
 so that ships can avoid hitting them
 To determine which bird species is/are on a given audio recording
collected in field conditions by Oregon State University
 Identifying heart failure by IBM researchers
 A way to extract heart failure diagnosis criteria from free-text
physician notes.
Machine Learning Applications
 To predict whether someone is a psychopath based on their Twitter usage
 To predict in advance whether a product launch will be successful or not.
 Stop malware
 Self driving cars
 Automatic speech recognition, automatic voice/face/fingerprint
recognition, automatic medical diagnostics and many more
Growth of Machine Learning
The rise and shine of machine learning is due to
 Improved data capture, networking, faster computers
 New sensors / Input Output devices
 Demand for self-customization to user, environment
 Improved machine learning algorithms
Data Science Skills
Programming
Mathematics/ Numerical Optimization
Linear Algebra/ Calculus
Introductory Statistics
Prerequisites
Do you really need to be an expert in math?
The real prerequisite for machine learning isn’t math, it’s data
analysis
https://www.r-bloggers.com/the-real-prerequisite-for-machine-learning-isnt-math-its-data-analysis/
http://courses.washington.edu/css490/2012.Winter/lecture_slides/02_math_essentials.pdf
Do you really need to be an expert in programming?
You do not have to be an excellent programmer to start your
career in machine learning
http://machinelearningmastery.com/what-if-im-not-a-good-programmer/
Adapt, improvise and conquer
Finish what you start no matter what
Start even when you are not ready
Passion for machine learning
Prerequisites
https://www.quora.com/What-are-prerequisites-to-start-learning-Machine-Learning
Learning ML
Adjust Mindset (believe!)
Pick a Process
Pick a Tool
Practice on Datasets
Build a Portfolio
http://machinelearningmastery.com/machine-learning-mastery-method/
ML Algorithm Categorization
On the basis of learning style
Supervised Learning
• Regression
• Logistic regression
• Backpropagation neural network
Unsupervised Learning
• K-means clustering
• Association rules
Semi-supervised Learning
• Regression
• Classification
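A minimal sketch of the difference between the first two learning styles, using toy spending figures that are assumed for illustration. A 1-nearest-neighbour classifier stands in for the supervised algorithms listed above (it is not one of them, only a short substitute), and a plain 1-D k-means represents the unsupervised case.

```python
# Toy 1-D data (assumed): customer monthly spend.
spend = [10, 12, 14, 50, 52, 55]

# Supervised learning: labels are provided together with the inputs.
labels = ["low", "low", "low", "high", "high", "high"]


def predict_1nn(x, xs, ys):
    """Label a new point with the label of its closest training point."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return ys[nearest]


# Unsupervised learning: no labels, the algorithm finds the groups itself.
def kmeans_1d(xs, centers, iterations=10):
    """Plain k-means on 1-D data, starting from the given centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for x in xs:
            closest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[closest].append(x)
        # Move each center to the mean of its cluster (unchanged if empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


supervised_prediction = predict_1nn(48, spend, labels)
centers, clusters = kmeans_1d(spend, centers=[10.0, 55.0])
```

The supervised model can name its answer ("high") because it was given names to learn from; k-means only returns two unnamed groups.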
Image Credit: Pinterest
Image Credit: PWC
Supervised Learning
Un-Supervised Learning
Semi-Supervised Learning
More Examples
ML Algorithm Categorization
On the basis of similarity in form/function
• Regression
• Classification
• Clustering
• Anomaly Detection
• Recommender System
• Regularization
• Tree-Based Algorithms
• ANN
• Deep Learning
ML Algorithm Categorization
On the basis of similarity in form/function, there are many other
families, such as ensemble methods, reinforcement learning,
dimensionality reduction algorithms, Bayesian algorithms, and
instance-based algorithms, as well as application areas such as
computer vision and natural language processing (NLP).
Read more: http://www.cs.uvm.edu/~icdm/algorithms/index.shtml
ML Algorithm Categorization
On the basis of target function
Parametric Algorithms
• Algorithms that simplify the function to a known form
• E.g., Linear Regression, Logistic Regression
Non-Parametric Algorithms
• Algorithms that don’t make strong assumptions about the form of the function
• E.g., kernel SVM, Decision Trees (an ANN with a fixed architecture is usually classed as parametric)
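The distinction can be illustrated with a small sketch on assumed toy data. The parametric model is a straight line fitted by least squares; the non-parametric model is k-nearest-neighbour regression, which is used here as a convenient stand-in and is not one of the examples named on the slide.

```python
def fit_line(xs, ys):
    """Parametric: assume y = a*x + b and estimate just two numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    a = covariance / variance
    b = mean_y - a * mean_x
    return a, b


def knn_predict(x, xs, ys, k=2):
    """Non-parametric: no fixed form; average the k nearest training targets."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]
    return sum(ys[i] for i in nearest) / k


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]  # toy data, exactly y = 2x

a, b = fit_line(xs, ys)            # the whole model is the pair (a, b)
line_pred = a * 6 + b
knn_pred = knn_predict(6, xs, ys)  # the whole training set is the model
```

Once `a` and `b` are known the training data can be discarded, whereas `knn_predict` needs every training point at prediction time. On this data the line extrapolates to x = 6 while the neighbour average cannot go beyond the targets it has seen.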
Parametric Vs Non-Parametric Algorithms
Parametric Algorithms
• Simpler
• Faster
• Need less data
Non-Parametric Algorithms
• Difficult to interpret
• Slower
• Need more data
Parametric Vs Non-Parametric Algorithms
Parametric Algorithms
• Constrained
• Limited complexity
• Poor fit
• Under-fitting
Non-Parametric Algorithms
• Power
• Flexible
• Performance
• Over-fitting
Trade-off between prediction accuracy & model interpretability
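Under-fitting and over-fitting can be seen by comparing error on the training data with error on unseen data. The sketch below uses assumed toy data (a trend of y = 2x with +/-1 noise on the training targets) and two deliberately extreme models chosen for illustration: a constant model that always predicts the training mean, and a 1-nearest-neighbour model that memorises the training set.

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (assumed): the trend is y = 2x; training targets carry +/-1 noise.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [1, 1, 5, 5, 9, 9]
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [2 * x for x in test_x]

# Heavily constrained model: always predict the training mean.
mean_y = sum(train_y) / len(train_y)


def constant_model(x):
    return mean_y


# Very flexible model: copy the target of the nearest training point.
def nearest_neighbor_model(x):
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


constant_train = mse(constant_model, train_x, train_y)
constant_test = mse(constant_model, test_x, test_y)
flexible_train = mse(nearest_neighbor_model, train_x, train_y)
flexible_test = mse(nearest_neighbor_model, test_x, test_y)
```

The constant model has a large error on both sets, which is the signature of under-fitting. The nearest-neighbour model has zero training error but a non-zero test error because it has also memorised the noise, which is the signature of over-fitting.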
Data is like people: interrogate it hard enough and it will tell you
whatever you want to hear
Just like that
The machine learning framework
y = f(x)
where y is the output, f is the prediction function and x is the input
Training: Given a training set of labeled examples {(x1, y1), …, (xN, yN)},
estimate the prediction function f by minimizing the prediction error on the
training set
Testing: Apply f to a never-before-seen test example x and output the
predicted value y = f(x)
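The two phases can be sketched as follows. This is a minimal illustration under stated assumptions: the prediction function is restricted to the family f(x) = w * x, the prediction error is squared error, and the training values are toy numbers. None of these choices come from the slides.

```python
def train(examples):
    """Estimate f by minimizing squared prediction error on the training set.

    Assumes the family f(x) = w * x, for which the minimizer is
    w = sum(x * y) / sum(x * x).
    """
    w = sum(x * y for x, y in examples) / sum(x * x for x, _ in examples)

    def f(x):
        return w * x

    return f


# Training: labeled examples (x_i, y_i), toy values.
training_set = [(1, 3), (2, 6), (3, 9)]
f = train(training_set)
training_error = sum((f(x) - y) ** 2 for x, y in training_set)

# Testing: x = 10 never appeared in the training set.
prediction = f(10)
```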
The ML framework - Training
Training Images → Image Features → Training (using Training Labels) → Learned model
The ML framework - Testing
Classification Example
Test Image → Image Features → Learned model → Prediction: Apple
Test Image → Image Features → Learned model → Prediction: Orange
Test Image → Image Features → Learned model → Prediction: Cherry
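The training and testing pipelines for the fruit example can be sketched end to end. Everything concrete here is an assumption made for illustration: images are short lists of (R, G, B) pixels, the features are the mean colour, and the learned model is a nearest-centroid classifier. A real image classifier would use richer features and a stronger model.

```python
def extract_features(image):
    """Image features: the mean (R, G, B) colour over all pixels."""
    n = len(image)
    return tuple(sum(pixel[c] for pixel in image) / n for c in range(3))


def train(images, labels):
    """Learned model: one average feature vector (centroid) per label."""
    grouped = {}
    for image, label in zip(images, labels):
        grouped.setdefault(label, []).append(extract_features(image))
    return {
        label: tuple(sum(f[c] for f in feats) / len(feats) for c in range(3))
        for label, feats in grouped.items()
    }


def predict(model, image):
    """Prediction: the label whose centroid is closest to the image features."""
    features = extract_features(image)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, model[label]))

    return min(model, key=distance)


# Toy "images" (assumed): each is a short list of (R, G, B) pixels.
training_images = [
    [(120, 200, 60), (110, 190, 50)],  # green apple
    [(250, 150, 20), (240, 160, 30)],  # orange
    [(150, 10, 30), (140, 20, 40)],    # cherry
]
training_labels = ["apple", "orange", "cherry"]

model = train(training_images, training_labels)
test_image = [(245, 155, 25), (235, 145, 35)]
prediction = predict(model, test_image)
```

Note that `extract_features` is applied identically during training and testing, matching the "Image Features" box that appears in both diagrams.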
How do machine learning algorithms work?
Model Representation
Model Evaluation
Model Improvement
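The three components can be made concrete with a small sketch, assuming a one-parameter linear model, mean squared error as the evaluation measure and gradient descent as the improvement step. The data, the learning rate and the function names are hypothetical choices for illustration.

```python
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy (x, y) pairs, assumed


def predict(w, x):
    """Model representation: a line through the origin with slope w."""
    return w * x


def loss(w):
    """Model evaluation: mean squared error on the data."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)


def improve(w, learning_rate=0.05):
    """Model improvement: one gradient-descent step on the loss."""
    gradient = sum(2 * (predict(w, x) - y) * x for x, y in data) / len(data)
    return w - learning_rate * gradient


w = 0.0
initial_loss = loss(w)
for _ in range(100):
    w = improve(w)
final_loss = loss(w)
```

Swapping any one of the three functions (a different model family, a different error measure, a different optimiser) gives a different learning algorithm while the overall loop stays the same.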
Machine Learning Tools/Libraries
R – e1071, randomForest, caret, glmnet, nnet, tree, RWeka
Python – scikit-learn, PyBrain, NumPy, SciPy, matplotlib, pandas
Hadoop – Mahout, Spark, Weka
RapidMiner
Orange
KNIME
H2O
Weka
Java Machine Learning Library
ML in Practice
 Understanding domain, prior knowledge, and goals
 Data integration, selection, cleaning, pre-processing, etc.
 Learning models
 Interpreting results
 Consolidating and deploying discovered knowledge
 Loop
References
 The Elements of Statistical Learning by Hastie, Tibshirani & Friedman
 Foundations of Machine Learning by Mohri, Rostamizadeh & Talwalkar
 Pattern Recognition and Machine Learning by Bishop
 Machine Learning with R by Lantz
 Machine Learning for Hackers by Conway & White
http://machinelearningmastery.com/
https://www.analyticsvidhya.com/
http://www.analyticbridge.com/
http://www.datasciencecentral.com/
https://www.kaggle.com/
http://stats.stackexchange.com
http://datascience.stackexchange.com/
https://www.researchgate.net
https://www.quora.com
https://github.com/
For your queries
http://stats.stackexchange.com/users/79100/nisha-arora
https://www.researchgate.net/profile/Nisha_Arora2/contributions
https://www.quora.com/profile/Nisha-Arora-9
http://learnerworld.tumblr.com/
Thank You
