Machine Learning
Lunch & Learn - Session 4
Luis Borbon
11/07/2017
Table of contents
1. Recap
2. Generalization in Machine Learning
3. Overfitting and Underfitting
4. Algorithms by Similarity
5. Real Application
6. People to follow
Recap
Recap
● Training, validation and test data sets.
● Learning Style
○ Supervised
○ Unsupervised
○ Semi-supervised
● Similarity
○ Regression Algorithms
○ Instance-based Algorithms
○ Regularization Algorithms
○ Decision Tree Algorithms
Recap
Decision trees
Possible applications in PlantMiner:
For a searcher: based on previous quotes, identify an item that is usually hired along with another (see the sketch below).
● Suggest the item.
● Offer a discount to add the suggested item.
For a supplier: identify suppliers likely to churn at the next subscription renewal.
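A minimal sketch of the searcher idea with a decision tree; every column name and value here is invented for illustration, since PlantMiner's real schema is not shown in these slides:

# Hedged sketch: predict whether a quote for one item also ends up
# hiring a companion item. All column names and values are invented;
# PlantMiner's real schema is not known here.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

quotes = pd.DataFrame({
    "item_category": ["excavator", "excavator", "crane", "crane", "bobcat", "excavator"],
    "project_size":  [3, 1, 2, 3, 1, 2],
    "also_hired_companion": [1, 0, 1, 1, 0, 1],   # e.g. a tilt bucket
})
X = pd.get_dummies(quotes[["item_category", "project_size"]],
                   columns=["item_category"])
y = quotes["also_hired_companion"]

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# For a new quote, predict whether to suggest the companion item.
new_quote = pd.get_dummies(
    pd.DataFrame({"item_category": ["excavator"], "project_size": [3]}),
    columns=["item_category"]).reindex(columns=X.columns, fill_value=0)
print("suggest companion item:", bool(tree.predict(new_quote)[0]))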
Generalization in Machine Learning
Induction and deduction
Induction refers to learning general concepts from specific examples, which is exactly the problem that supervised machine learning aims to solve.
Deduction works the other way around: it derives specific conclusions from general rules.
Induction and deduction
The goal of a good machine learning model is to
generalize well from the training data to any data
from the problem domain.
This allows us to make predictions in the future
on data the model has never seen.
Overfitting and Underfitting
Overfitting
In machine learning, one of the most common tasks is to fit a "model" to a set of training data so as to be able to make reliable predictions on general, unseen data.
In overfitting, a statistical model describes
random error or noise instead of the underlying
relationship.
The green line represents an overfitted model and the black line a regularised model. While the green line best follows the training data, it depends too heavily on that data and is likely to have a higher error rate on new, unseen data than the black line.
Overfitting
A model that has been overfit has poor
predictive performance, as it overreacts to minor
fluctuations in the training data.
Noisy (roughly linear) data fitted with both a linear and a polynomial function. Although the polynomial function fits the training points perfectly, the linear version can be expected to generalize better: if the two functions were used to extrapolate beyond the fitted data, the linear function would make better predictions.
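A small sketch of this effect on synthetic data, assuming nothing beyond numpy: a high-degree polynomial fits the training points almost perfectly but extrapolates far worse than a straight line. The data is made up for illustration.

# Sketch: fit noisy, roughly linear data with a linear model and a
# high-degree polynomial, then compare errors on held-out points
# that lie beyond the training range (extrapolation).
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 30))
y = 2.0 * x + 1.0 + rng.normal(0, 2.0, 30)   # roughly linear + noise

x_train, y_train = x[:20], y[:20]
x_test,  y_test  = x[20:], y[20:]

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse  = np.mean((np.polyval(coeffs, x_test)  - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.2f}, test MSE {test_mse:.2e}")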
Overfitting
Overfitting occurs when a model is excessively
complex, such as having too many parameters
relative to the number of observations.
Overfitting/overtraining in supervised learning (e.g., a neural network). Training error is shown in blue, validation error in red, both as a function of the number of training cycles. If the validation error increases (positive slope) while the training error steadily decreases (negative slope), overfitting may have occurred. The best predictive and fitted model is found where the validation error has its global minimum.
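A hedged sketch of that monitoring loop, using a small neural network trained incrementally on synthetic data; the model and data are placeholders, but the pattern of tracking both errors and keeping the epoch with the lowest validation error is the general one.

# Sketch: track training vs validation error per epoch and note where
# validation error bottoms out (the early-stopping point).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.2, 200)   # noisy synthetic target

X_train, y_train = X[:150], y[:150]
X_val,   y_val   = X[150:], y[150:]

net = MLPRegressor(hidden_layer_sizes=(50,), learning_rate_init=0.01,
                   random_state=0)
best_epoch, best_val = 0, float("inf")
for epoch in range(1, 201):
    net.partial_fit(X_train, y_train)    # one pass = one training cycle
    val = mean_squared_error(y_val, net.predict(X_val))
    if val < best_val:
        best_epoch, best_val = epoch, val
print("lowest validation error at epoch", best_epoch)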
Underfitting
Underfitting occurs when a statistical model or machine
learning algorithm cannot capture the underlying trend of
the data.
It occurs when the model or algorithm does not fit the data closely enough: the model shows low variance but high bias (in contrast to overfitting, which arises from high variance and low bias). It is often the result of an excessively simple model.
Underfitting would occur, for example, when fitting a linear
model to non-linear data.
Such a model would have poor predictive performance.
Limiting overfitting
There are two important techniques that you can use
when evaluating machine learning algorithms to limit
overfitting:
● Use a resampling technique to estimate model
accuracy.
● Hold back a validation dataset.
Limiting overfitting
Resampling
The most popular resampling technique is k-fold cross validation. It allows you to train and test your model k times on different subsets of the training data and build up an estimate of the performance of the model on unseen data.
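A minimal sketch with scikit-learn, using the bundled iris data purely as a stand-in dataset:

# Sketch: estimate model accuracy with 5-fold cross validation.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
print("per-fold accuracy:", scores)
print("mean accuracy: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))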
Validation dataset
A validation dataset is simply a subset of your training data that you hold back from your machine
learning algorithms until the very end of your project. After you have selected and tuned your
machine learning algorithms on your training dataset you can evaluate the learned models on the
validation dataset to get a final objective idea of how the models might perform on unseen data.
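And the holdout pattern, sketched the same way: split off a validation set first, tune on the training portion, and score the final model on the held-back data only once.

# Sketch: hold back a validation set and touch it only at the end.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)   # 20% held back

model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)                # select/tune on training data only
print("final check on held-back data:", model.score(X_val, y_val))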
Algorithms by Similarity (cont…)
Bayesian Algorithms
Bayesian methods are those that explicitly apply
Bayes’ Theorem for problems such as
classification and regression.
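For reference, Bayes' Theorem in its standard form, relating the posterior probability of a hypothesis A given evidence B to the prior P(A) and the likelihood P(B|A):

P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}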
A common application is document classification based on word frequencies, e.g. spam filtering; with appropriate pre-processing, naive Bayes is competitive in this domain with more advanced methods, including support vector machines. It also finds application in automatic medical diagnosis.
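A small sketch of that document-classification use case with scikit-learn; the tiny in-line corpus is invented for illustration:

# Sketch: naive Bayes spam classification from word frequencies.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["win money now", "cheap pills win prize",       # spam examples
        "meeting at noon", "project update attached"]   # ham examples
labels = [1, 1, 0, 0]                                   # 1 = spam

vec = CountVectorizer()
X = vec.fit_transform(docs)            # word-frequency matrix
clf = MultinomialNB().fit(X, labels)
print(clf.predict(vec.transform(["win a cheap prize", "noon meeting"])))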
Bayesian Algorithms
The most popular Bayesian algorithms are:
● Naive Bayes
● Gaussian Naive Bayes
● Multinomial Naive Bayes
● Averaged One-Dependence Estimators
(AODE)
● Bayesian Belief Network (BBN)
● Bayesian Network (BN)
Real Application
DoseMe.com.au
Bayesian dosing uses patient data and laboratory results to estimate a patient's ability to absorb, process, and clear a drug from their system. Using a published population model, DoseMe's algorithms adjust the pharmacokinetic and/or pharmacodynamic parameters so that a patient-specific, individualised drug model is built. This individual model is then used to provide a patient-specific dosing recommendation to reach a therapeutic target.
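A heavily simplified sketch of the underlying idea, not DoseMe's actual method: start from a population prior over a pharmacokinetic parameter such as clearance, then update it with one measured drug concentration to get a patient-specific estimate. Every number here is invented.

# Sketch: Bayesian update of a drug-clearance parameter on a grid.
# One-compartment model: concentration = dose/V * exp(-(CL/V) * t).
# All values are invented; this is not DoseMe's actual algorithm.
import numpy as np

dose, V, t = 500.0, 50.0, 6.0             # mg, litres, hours (assumed)
measured_conc = 4.2                        # mg/L, the patient's lab result

CL = np.linspace(0.5, 15.0, 500)           # grid of clearance values (L/h)
prior = np.exp(-0.5 * ((CL - 5.0) / 2.0) ** 2)    # population prior (Gaussian)

predicted = dose / V * np.exp(-(CL / V) * t)      # model prediction per CL
likelihood = np.exp(-0.5 * ((measured_conc - predicted) / 0.5) ** 2)

posterior = prior * likelihood
posterior /= posterior.sum()
print("patient-specific clearance estimate: %.2f L/h"
      % CL[np.argmax(posterior)])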
People to Follow
Fei-Fei Li
Fei-Fei Li, who publishes under the name Li Fei-Fei, is an
Associate Professor of Computer Science at Stanford
University. She is the director of the Stanford Artificial
Intelligence Lab and the Stanford Vision Lab.
● Born: 1976, Beijing, China
● Spouse: Silvio Savarese
● Education: California Institute of Technology (2005)
● Residence: United States of America
● Books: Computer Vision: From 3D Reconstruction to Visual Recognition
● Doctoral advisors: Pietro Perona, Christof Koch
● http://vision.stanford.edu/feifeili/
● @drfeifei
Andrej Karpathy
Director of AI at Tesla, currently focused on perception for Autopilot.
Previously a Research Scientist at OpenAI, working on Deep Learning in Computer Vision, Generative Modeling and Reinforcement Learning.
PhD from Stanford, where he worked with Fei-Fei Li on Convolutional/Recurrent Neural Network architectures and their applications in Computer Vision, Natural Language Processing and their intersection.
● http://cs.stanford.edu/people/karpathy/
● @karpathy
OpenAI
Founded: December 11, 2015
Founders: Elon Musk, Sam Altman, and others
Type: 501(c)(3) Nonprofit organization
Location: San Francisco, California, USA
Products: OpenAI Gym
Mission: Friendly artificial intelligence
● https://www.openai.com/
● @OpenAI
