Learning Algorithms
Capacity, Overfitting and Underfitting
Hyperparameters and Validation Sets
Machine Learning for Deep Learning
Sung-Yub Kim
Dept of IE, Seoul National University
January 14, 2017
Sung-Yub Kim Machine Learning for Deep Learning
Definition of Machine Learning Algorithm
Experience E
Task T
Performance Measure P
Definition
Machine Learning Algorithm
"A computer program is said to learn from experience E with respect to some
class of tasks T and performance measure P if its performance at tasks in T,
as measured by P, improves with experience E." (Mitchell, 1997)
Most learning algorithms experience a dataset.
Supervised Learning
Each example contains features and is also associated with a label or
target.
ex) How much is a used car worth? Will the price of an asset go up?
Unsupervised Learning
Each example contains features, and the machine is asked to learn useful properties of
the structure of the dataset. The output of the learning algorithm is usually
a distribution over the data, represented explicitly or implicitly.
ex) Density estimation, Synthesis, Denoising
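The distinction between these two concepts is not formal.
An ML problem of one kind can be transformed into the other kind. The chain rule
splits the unsupervised problem of modeling p(x) into n supervised problems:
$$p(\mathbf{x}) = p(x_1) \prod_{i=2}^{n} p(x_i \mid x_1, \dots, x_{i-1}) \qquad (1)$$
and the supervised problem of learning p(y | x) can be solved by learning the joint
distribution p(x, y) and then computing
$$p(y \mid \mathbf{x}) = \frac{p(\mathbf{x}, y)}{\sum_{y'} p(\mathbf{x}, y')} \qquad (2)$$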
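As a small illustration of equation (2), the sketch below recovers p(y | x) from a joint table p(x, y). The table values and the names `joint`, `p_x` and `cond` are made up for this example and are not from the slides.

```python
import numpy as np

# Hypothetical joint distribution p(x, y): rows are 3 values of x, columns are 2 labels y.
joint = np.array([
    [0.10, 0.20],
    [0.25, 0.05],
    [0.30, 0.10],
])

# Equation (2): p(y | x) = p(x, y) / sum over y' of p(x, y').
p_x = joint.sum(axis=1, keepdims=True)  # marginal p(x), one value per row
cond = joint / p_x                      # row i holds p(y | x = i)

print(cond)
```

Each row of `cond` sums to one, so a model of the joint distribution can answer a supervised query by normalizing over the labels.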
Other learning paradigms are also possible.
Semi-supervised Learning
Basically supervised, but only some of the examples have labels.
ex) Infer blanks, Recover corrupted data
Reinforcement Learning
The algorithm interacts with an environment; more formally, RL solves a
partially informed Markov Decision Process.
ex) AlphaGo
Design Matrix
In a design matrix, examples are stored in rows and features are stored in columns.
To build a design matrix, every example must be a vector of the same dimension.
For generality (for example, when examples have different sizes), we sometimes describe the data as a set of examples instead.
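A minimal sketch of the row/column convention described above, using NumPy. The feature values are made up for illustration.

```python
import numpy as np

# Three hypothetical examples, each described by the same four features.
examples = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.2, 3.4, 5.4, 2.3],
]

# Design matrix: one row per example, one column per feature.
X = np.array(examples)
n_examples, n_features = X.shape

second_example = X[1]    # a row is a single example
first_feature = X[:, 0]  # a column is one feature across all examples

print(n_examples, n_features)
```

If the examples had different lengths, `np.array` could not build a rectangular matrix, which is the case where a plain set (or list) of examples is the more general description.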
Machine Learning tasks are usually described in terms of how the machine
learning system should process an example.
An example is a collection of features that have been quantitatively measured
from some object or event.
An example can be represented as a vector $\mathbf{x} \in \mathbb{R}^n$, where each entry $x_i$ of the vector is a
feature.
Examples of T
Classification
Given an example, select one class among 1, ..., k. More formally, we
need to learn a function f
$$f : \mathbb{R}^n \to \{1, \dots, k\} \qquad (3)$$
Regression
Given an example, output one real value. More formally, we need to learn a
function f
$$f : \mathbb{R}^n \to \mathbb{R} \qquad (4)$$
Fill the missing inputs
If some inputs are corrupted or removed, we need to choose whether to
build a model that fills in all the blanks or to estimate a single probability
distribution. The first is supervised learning, and the second is
unsupervised learning.
To evaluate the abilities of a ML algorithm, we must design a quantitative
measure of its performance.
If T is a classification or regression problem, we need to
define an error measure, which will be minimized.
ex) 0-1 loss for classification, sum of squared errors for regression
For a density estimation problem, we cannot use an error rate because there
is no correct label, so we define the cross entropy $E_p[-\log q]$, which will
be minimized (equivalently, the average log-likelihood is maximized).
Since we are interested in how well the ML algorithm performs on data that it
has not seen before, we evaluate the performance measure on a test set that
is set aside before training.
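The three measures named above can be sketched in a few lines of NumPy. The function names are chosen for this example, and `cross_entropy` assumes two discrete distributions on the same support with q > 0 wherever p > 0.

```python
import numpy as np

def zero_one_loss(y_true, y_pred):
    """Fraction of examples whose predicted class differs from the label."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))

def sum_squared_error(y_true, y_pred):
    """Sum of squared differences between targets and predictions."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sum(diff ** 2))

def cross_entropy(p, q):
    """E_p[-log q] for discrete distributions p (data) and q (model)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(-np.sum(p * np.log(q)))

print(zero_one_loss([0, 1, 1, 0], [0, 1, 0, 0]))
print(sum_squared_error([1.0, 2.0], [1.5, 1.0]))
print(cross_entropy([0.5, 0.5], [0.9, 0.1]))
```

The cross entropy is smallest when q equals p, which is why it is treated as a quantity to minimize.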
Types of Error
Underfitting and Overfitting
No Free Lunch Theorem
Regularization
ML algorithms need to perform well on new, previously unseen inputs, not just
on those the model was trained on. The ability to perform well on
unobserved inputs is called generalization.
Training Error
When training a ML model, we have access to a training set; we can
compute some error measure on the training set, called the training error;
and we reduce this training error. In short, training just solves a
minimization problem whose objective function is the training error.
Generalization Error
When the training process is over, we need to evaluate the ML model on a
test set. The expected error on a new input is called the generalization error
(also called the test error), and it is estimated by the error measured on the test set.
In short, we want to minimize $E_{p_{\text{data}}}[\text{error}]$, while in training we actually minimize
$E_{p_{\text{train}}}[\text{error}]$ (in this context, $p_{\text{train}}$ is the empirical distribution of the training set).
If we assume that the training set and the test set are independent and identically
distributed (i.i.d. assumption) and the model is independent of both data
sets, then in expectation
$$E_{p_{\text{data}}}[\text{error}] = E_{p_{\text{train}}}[\text{error}] = E_{p_{\text{test}}}[\text{error}] \qquad (5)$$
However, since we train the model on the training data, the model is not independent of it.
Therefore, we get
$$E_{p_{\text{train}}}[\text{error}] \le E_{p_{\text{test}}}[\text{error}] \qquad (6)$$
The factors determining how well a ML algorithm will perform are its ability to
1 Make the training error small (if not, underfitting occurs)
2 Make the gap between training and test error small (if not, overfitting
occurs)
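The two factors can be observed on a toy problem. The sketch below is an illustration under assumed settings (a noisy sine target, 15 training points, polynomial degrees 1, 3 and 9), none of which come from the slides. It fits polynomials of increasing degree and reports training and test error for each.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Draw n points from an assumed data-generating process: noisy sin(pi * x)."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(np.pi * x) + rng.normal(scale=0.1, size=n)
    return x, y

x_train, y_train = sample(15)
x_test, y_test = sample(200)

def mse(coeffs, x, y):
    """Mean squared error of a polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, deg=degree)  # least-squares fit on the training set
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))

for degree, (train_err, test_err) in results.items():
    print(f"degree={degree}: train MSE={train_err:.4f}, test MSE={test_err:.4f}")
```

The training error can only go down as the degree grows, because each polynomial family contains the previous one. A large training error at degree 1 corresponds to underfitting, and a large gap between training and test error at a high degree typically signals overfitting.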
A model's capacity is its ability to fit a wide variety of functions. If the capacity is
too low, the model struggles to fit the training set. If it is too high, the model may overfit.
Control of Capacity
1 Choosing the Hypothesis Space: control the family of functions
2 Representational Capacity: control the range of the parameters
VC-dimension
The VC-dimension measures the capacity of a binary classifier.
The most important theorem in Statistical Learning
$$P\left[\text{test error} \le \text{training error} + \sqrt{\frac{1}{N}\left[D\left(\log\frac{2N}{D} + 1\right) - \log\frac{\eta}{4}\right]}\,\right] \ge 1 - \eta \qquad (7)$$
where D is the VC-dimension of the model and N is the number of training examples (the bound assumes D ≪ N). Therefore, we
get sample-complexity bounds like
$$N = \Theta\left(D + \ln\frac{1}{\delta}\right) \qquad (8)$$
where δ is the allowed probability of failure.
But this bound is not practical for deep learning: the VC-dimension of a deep network is too large for the bound to be informative.
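To get a feel for the numbers, the sketch below evaluates the gap term of the VC bound, assuming its standard form in which the quantity added to the training error is sqrt((1/N) [D (log(2N/D) + 1) - log(η/4)]). The function name `vc_gap` and the chosen values of N, D and η are illustrative only.

```python
import math

def vc_gap(n_samples, vc_dim, eta):
    """Gap term of the VC bound: sqrt((D * (log(2N/D) + 1) - log(eta/4)) / N)."""
    inner = vc_dim * (math.log(2 * n_samples / vc_dim) + 1) - math.log(eta / 4)
    return math.sqrt(inner / n_samples)

# A small model (D = 10): the gap shrinks as the number of examples grows.
for n in (1_000, 100_000, 10_000_000):
    print(n, round(vc_gap(n, vc_dim=10, eta=0.05), 4))

# A model whose VC-dimension is as large as the dataset: the gap exceeds 1,
# so the bound says nothing useful about an error rate that lies in [0, 1].
print(round(vc_gap(1_000_000, vc_dim=1_000_000, eta=0.05), 4))
```

The last line shows in miniature why the bound is of little use for very high-capacity models: once D is comparable to N, the guaranteed gap is larger than any possible error rate.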
Bayes Error
The error incurred by an oracle making predictions from the true distribution
p(x, y) is called the Bayes error.
Relationship between Capacity and Error
If we have enough capacity and data, then by the VC bound
$E_{p_{\text{train}}}[\text{error}] \approx E_{p_{\text{test}}}[\text{error}]$ and the generalization error converges to the Bayes error.
But if we do not have enough capacity, this convergence does not occur.
No Free Lunch Theorem for ML
"Averaged over all possible data-generating distributions, every classification
algorithm has the same error rate when classifying previously unobserved
points."
This theorem implies that no ML algorithm is universally better than any
other.
Therefore, the goal of ML is not to seek a universal learning algorithm or the
absolute best learning algorithm. Instead, our goal is to understand what kinds of
distributions are relevant to the "real world", and what kinds of ML algorithms
perform well on data drawn from the kinds of data-generating distributions we
care about.
Instead of explicitly removing some set of functions, we can obtain a model that
generalizes better by adding a regularizer such as $\Omega(w) = w^\top w$.
This type of regularization arises naturally, since the Lagrangian of the
optimization problem
$$\min_{w} \left\{ E_{p_{\text{train}}}[\text{error}] : \|w\|^2 \le \alpha \right\} \qquad (9)$$
is, up to a constant that does not depend on w,
$$E_{p_{\text{train}}}[\text{error}] + \lambda \|w\|^2 \qquad (10)$$
and the only difference between solving the original problem and minimizing the Lagrangian
is that we do not know exactly which λ corresponds to a given α.
Many other regularization techniques also exist, some of which act
implicitly.
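For linear regression with squared error, the penalized objective has a closed-form minimizer (ridge regression), which makes the effect of λ easy to see. The sketch below is an assumption-laden illustration: the synthetic data, the helper `ridge_fit` and the use of a sum (not a mean) of squared errors as the training error are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))                      # synthetic design matrix
true_w = np.array([2.0, -1.0, 0.5, 0.0, 3.0])     # made-up ground-truth weights
y = X @ true_w + rng.normal(scale=0.1, size=50)

def ridge_fit(X, y, lam):
    """Minimize ||Xw - y||^2 + lam * ||w||^2 via the normal equations."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

lambdas = (0.0, 1.0, 10.0, 100.0)
norms = [float(np.linalg.norm(ridge_fit(X, y, lam))) for lam in lambdas]

for lam, norm in zip(lambdas, norms):
    print(f"lambda={lam}: ||w|| = {norm:.4f}")
```

A larger λ yields a weight vector with a smaller norm, which matches the constrained view: each λ corresponds to some bound α on the norm, even though the mapping between the two is not known in advance.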
Hyperparameters
Cross-Validation
Hyper-parameter
A hyper-parameter is a setting that is given to the ML model before learning rather than learned by it.
Sometimes we can optimize or learn it, but most of the time it is not appropriate to
learn a hyper-parameter on the training set. This applies to all
hyper-parameters that control model capacity: learned on the training set, they would always choose the highest capacity, which leads to overfitting.
Validation Set
To solve the above problem, we need a validation set of examples that the training
algorithm does not observe. Examples in the validation set must not be
included in the test set either, since optimizing hyper-parameters is also
part of learning. In short, if we optimize a hyper-parameter, we train the model
twice: once to fit the parameters on the training set and once to choose the hyper-parameter on the validation set. The ratio of training data to validation data is usually 8:2.
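The procedure can be sketched as follows, assuming a ridge penalty λ as the hyper-parameter and a random 8:2 split. The data, the candidate values and the helper names (`ridge_fit`, `mse`) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.5, size=100)

# Shuffle, then split the available (non-test) data 8:2 into training and validation sets.
indices = rng.permutation(len(X))
n_train = int(0.8 * len(X))
train_idx, val_idx = indices[:n_train], indices[n_train:]

def ridge_fit(X, y, lam):
    """Fit parameters w on the given data for a fixed hyper-parameter lam."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

# Inner step: fit parameters on the training set for each candidate lam.
# Outer step: pick the lam whose fitted model has the lowest validation error.
candidates = [0.01, 0.1, 1.0, 10.0, 100.0]
val_errors = {
    lam: mse(ridge_fit(X[train_idx], y[train_idx], lam), X[val_idx], y[val_idx])
    for lam in candidates
}
best_lam = min(val_errors, key=val_errors.get)
print(best_lam, val_errors[best_lam])
```

The test set does not appear anywhere in this loop; it is only used once, after `best_lam` has been fixed.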
K-fold Cross Validation
If you do not have enough data, the particular way the data is divided can strongly
affect the measured performance of the model. In that case, you need to average the model
evaluation over repeated train-test divisions. That is, you
divide the data into k partitions and train k times, each time using (k-1) partitions
for training and the remaining partition for testing. Then you average the k
performance values to get an estimate of the true model performance. One problem is
that no unbiased estimator of the variance of such an average error estimator
exists.
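The k-fold procedure described above can be sketched with plain NumPy. The helper names (`k_fold_indices`, `k_fold_score`), the least-squares model and the synthetic data are assumptions made for this example.

```python
import numpy as np

def k_fold_indices(n_examples, k, seed=0):
    """Shuffle the example indices and split them into k nearly equal partitions."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_examples), k)

def k_fold_score(X, y, k, fit, score):
    """Train k times on (k-1) partitions, test on the held-out one, and average."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train_idx], y[train_idx])
        scores.append(score(model, X[test_idx], y[test_idx]))
    return float(np.mean(scores)), scores

# Usage with an ordinary least-squares model on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=30)

fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
score = lambda w, X, y: float(np.mean((X @ w - y) ** 2))

mean_mse, fold_mses = k_fold_score(X, y, k=5, fit=fit, score=score)
print(mean_mse, fold_mses)
```

Every example is used for testing exactly once, so the average uses all of the data. The k fold scores are not independent, because their training sets overlap, which is related to the difficulty of estimating the variance of the average.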
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 

Recently uploaded (20)

VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 

Machine learning for deep learning

  • 1. Learning Algorithms Capacity, Overfitting and Underfitting Hyperparameters and Validations Sets Machine Learning for Deep Learning Sung-Yub Kim Dept of IE, Seoul National University January 14, 2017 Sung-Yub Kim Machine Learning for Deep Learning
  • 2. Learning Algorithms (Definition of Machine Learning Algorithm, Experience E, Task T, Performance Measure P). Definition of a machine learning algorithm: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.”
  • 3. Experience E. Most learning algorithms experience a dataset. Supervised Learning: each example contains features and is also associated with a label or target. ex) How much is a used car worth? Will the price of an asset go up? Unsupervised Learning: each example contains features, and the machine is asked to learn useful properties of the structure of the dataset. The output of the learning algorithm is usually a distribution over the data, whether explicit or implicit. ex) Density estimation, synthesis, denoising
  • 4. These two concepts are not formally distinct. An ML problem of one kind can be transformed into the other kind by $p(x) = p(x_1) \prod_{i=2}^{n} p(x_i \mid x_1, \dots, x_{i-1})$ (1) and $p(y \mid x) = \frac{p(x, y)}{\sum_{y'} p(x, y')}$ (2)
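As a small illustration of equation (2), the sketch below recovers the conditional distribution p(y | x) from a joint table p(x, y). The probabilities and the names `joint` and `conditional` are made up for this example and do not come from the slides.

```python
# Hypothetical joint distribution p(x, y) over two binary variables.
joint = {
    (0, 0): 0.1,
    (0, 1): 0.3,
    (1, 0): 0.4,
    (1, 1): 0.2,
}


def conditional(joint, x):
    """Return p(y | x) as a dict, using p(y|x) = p(x, y) / sum_y' p(x, y')."""
    # Marginal p(x): sum the joint over every value of y for this x.
    marginal = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / marginal for (xi, y), p in joint.items() if xi == x}


p_y_given_x0 = conditional(joint, 0)
print(p_y_given_x0)
```

Learning the joint p(x, y) is an unsupervised problem, yet the supervised quantity p(y | x) falls out of it by normalization, which is the point of the slide.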
  • 5. Other learning paradigms are also possible. Semi-supervised Learning: basically supervised, but the labels are only partially filled in. ex) Infer blanks, recover corrupted data. Reinforcement Learning: the learner interacts with an environment; more formally, RL solves a partially informed Markov Decision Process. ex) AlphaGo
  • 6. Design Matrix. In a design matrix, examples are stored in rows and features are stored in columns. To build a design matrix, every example vector must have the same dimension. When that is not possible, we describe the data more generally as a set of examples.
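A minimal sketch of the design-matrix convention described above, using plain Python lists. The helper name `build_design_matrix` and the sample values are assumptions for illustration only.

```python
def build_design_matrix(examples):
    """Stack examples as rows; every example must have the same number of features."""
    if not examples:
        raise ValueError("need at least one example")
    n_features = len(examples[0])
    for ex in examples:
        if len(ex) != n_features:
            raise ValueError("all examples must have the same dimension")
    return [list(ex) for ex in examples]


# Three hypothetical examples with two features each.
X = build_design_matrix([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)])
n_examples, n_features = len(X), len(X[0])

# A column of the design matrix holds one feature across all examples.
second_feature_column = [row[1] for row in X]
```

Examples of different lengths are rejected, which mirrors the requirement that all data vectors share one dimension.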
  • 7. Task T. Machine learning tasks are usually described in terms of how the machine learning system should process an example. An example is a collection of features that have been quantitatively measured from some object or event. An example can be represented as a vector $x \in \mathbb{R}^n$, where each entry $x_i$ of the vector is a feature.
  • 8. Examples of T. Classification: take an example and select one class among 1, . . . , k. More formally, we need to construct a function $f : \mathbb{R}^n \to \{1, \dots, k\}$ (3). Regression: take an example and output one real value. More formally, we need to construct a function $f : \mathbb{R}^n \to \mathbb{R}$ (4). Filling in missing inputs: if some inputs are corrupted or removed, we need to choose whether to build a model that fills in all the blanks or to estimate a single probability distribution. The first is supervised learning, and the second is unsupervised learning.
  • 9. Performance Measure P. To evaluate the abilities of an ML algorithm, we must design a quantitative measure of its performance. If T is a classification or regression problem, we define an error measure to be minimized. ex) 0-1 loss for classification, sum of squared errors for regression. For a density estimation problem, we cannot use an error rate because there is no correct label, so we define the cross entropy $E_p[-\log q]$, which is minimized (equivalently, the average log-likelihood $E_p[\log q]$ is maximized). Since we are interested in how well the ML algorithm performs on data it has not seen before, we evaluate the performance measure on test data that is set aside before training.
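The three performance measures named above can be sketched in a few lines. The function names and the sample inputs are assumptions chosen for illustration; the cross entropy is written for discrete distributions given as lists of probabilities.

```python
import math


def zero_one_loss(y_true, y_pred):
    """Fraction of misclassified examples (the error rate)."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def sum_squared_error(y_true, y_pred):
    """Sum of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


def cross_entropy(p, q):
    """E_p[-log q] for two discrete distributions over the same outcomes."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.5]  # hypothetical data distribution
q = [0.9, 0.1]  # hypothetical model distribution
print(zero_one_loss([0, 1, 1, 0], [0, 1, 0, 0]))
print(cross_entropy(p, p), cross_entropy(p, q))
```

For a fixed p, the cross entropy is smallest when q equals p, so a lower value indicates a better density model.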
  • 10. Capacity, Overfitting and Underfitting (Types of Error, Underfitting and Overfitting, No Free Lunch Theorem, Regularization). ML algorithms need to perform well on new, previously unseen inputs, not just on those the model was trained on. The ability to perform well on unobserved inputs is called generalization. Training Error: when training an ML model, we have access to a training set; we can compute some error measure on the training set, called the training error, and we reduce this training error. In short, training just solves a minimization problem whose objective function is the training error. Generalization Error: when training is over, we need to evaluate the ML model on a test set, and the measure of this evaluation is called the generalization error, also called the test error. In short, we want to minimize $E_{p_{data}}[\text{error}]$, while in training we actually minimize $E_{p_{train}}[\text{error}]$ (in this context, $p_{train}$ is the empirical distribution).
  • 11. If we assume that the training set and the test set are independent and identically distributed (the i.i.d. assumption) and that the model is independent of both data sets, we can say that $E_{p_{data}}[\text{error}] = E_{p_{train}}[\text{error}] = E_{p_{test}}[\text{error}]$ (5). However, since we train the model on the training data, the model is not independent of it. Therefore, we get $E_{p_{train}}[\text{error}] \le E_{p_{test}}[\text{error}]$ (6). The factors determining how well an ML algorithm will perform are its ability to (1) make the training error small (if not, underfitting occurs) and (2) make the gap between training and test error small (if not, overfitting occurs).
  • 12. A model’s capacity is its ability to fit a wide variety of functions. If the capacity is too low, the model struggles to fit the training set; if it is too high, overfitting may occur. Control of Capacity: (1) Choosing the hypothesis space: control the family of functions. (2) Representational capacity: control the range of the parameters. VC-dimension: the VC-dimension measures the capacity of a binary classifier. The most important theorem in statistical learning: $P\left[\text{test error} \le \text{training error} + \sqrt{\frac{1}{N}\left[D\left(\log\frac{2N}{D} + 1\right) - \log\frac{\eta}{4}\right]}\right] \ge 1 - \eta$ (7), where D is the VC-dimension of the model and N is the number of data points. Therefore, we get sample-complexity bounds like $N = \Theta\left(D + \ln\frac{1}{\delta}\right)$ (8). But this bound cannot be applied to deep learning in practice (it is too large).
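To get a feel for the size of the confidence term, the sketch below evaluates it numerically, assuming the standard form of the VC bound in which the gap is sqrt((D(log(2N/D) + 1) - log(η/4)) / N). The function name `vc_gap` and the chosen values of N, D and η are illustrative assumptions.

```python
import math


def vc_gap(n, d, eta):
    """Confidence term of the VC bound: sqrt((d*(log(2n/d) + 1) - log(eta/4)) / n)."""
    return math.sqrt((d * (math.log(2 * n / d) + 1) - math.log(eta / 4)) / n)


# The gap shrinks as the number of examples grows (VC-dimension held at 10).
for n in (1_000, 10_000, 100_000):
    print(n, round(vc_gap(n, d=10, eta=0.05), 3))
```

The term grows with D and shrinks with N, so a model whose VC-dimension is comparable to or larger than the dataset size gets a bound too loose to be informative, which is the situation the slide flags for deep learning.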
  • 13. Bayes Error: the error incurred by an oracle making predictions from the true distribution p(x, y) is called the Bayes error. Relationship between Capacity and Error: if we have enough capacity and data, then by the VC-dimension bound $E_{p_{train}}[\text{error}] \approx E_{p_{test}}[\text{error}]$ and the generalization error converges to the Bayes error. But if we do not have enough capacity, this does not happen.
  • 14. No Free Lunch Theorem for ML: “Averaged over all possible data-generating distributions, every classification algorithm has the same error rate when classifying previously unobserved points.” This theorem states that no ML algorithm is universally better than any other. Therefore, the goal of ML is not to seek a universal learning algorithm or the absolute best learning algorithm. Instead, our goal is to understand what kinds of distributions are relevant to the “real world”, and what kinds of ML algorithms perform well on data drawn from the kinds of data-generating distributions we care about.
  • 15. Regularization. Instead of explicitly removing some set of functions, we can obtain a model that generalizes better by adding a regularizer such as $\Omega(w) = w^\top w$. This type of regularization arises because the Lagrangian of the optimization problem $\min_w \{E_{p_{train}}[\text{error}] : \|w\|^2 \le \alpha\}$ (9) is, up to a constant that does not depend on w, $E_{p_{train}}[\text{error}] + \lambda \|w\|^2$ (10), and the only difference between solving the original problem and minimizing the Lagrangian is that we do not know exactly which λ is best. There are also many other regularization techniques, some of which act implicitly.
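The effect of the penalty in (10) can be seen in the simplest possible case: a single scalar weight and a squared-error loss. This is a sketch under those assumptions, not the slides' own example. The objective here is the sum of squared errors plus λw², whose minimizer has the closed form Σxy / (Σx² + λ); the data values are made up.

```python
def ridge_slope(xs, ys, lam):
    """Minimizer of sum((y - w*x)**2) + lam * w**2 over a scalar weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical noisy data roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Larger penalties pull the fitted weight toward zero.
weights = {lam: ridge_slope(xs, ys, lam) for lam in (0.0, 1.0, 10.0, 100.0)}
for lam, w in weights.items():
    print(lam, round(w, 3))
```

Each λ corresponds to some norm budget α in the constrained problem (9); increasing λ has the same effect as tightening α.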
  • 16. Hyperparameters and Validation Sets (Hyperparameters, Cross-Validation). Hyper-parameter: a hyper-parameter is a parameter that is fixed in advance rather than fitted while the ML model is learning. Sometimes we can optimize or learn it, but most of the time it is not appropriate to learn a hyper-parameter on the training set. This applies to all hyper-parameters that control model capacity. Validation Set: to solve the above problem, we need a validation set of examples that the training algorithm does not observe. Examples in the validation set also cannot be included in the test set, since optimizing hyper-parameters is itself part of learning. In short, if we optimize a hyper-parameter we train the model twice. The ratio of training data to validation data is usually 8:2.
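A minimal sketch of the 8:2 split mentioned above. The function name `train_validation_split`, the fixed seed and the stand-in data are assumptions for illustration; the test set is assumed to have been set aside already.

```python
import random


def train_validation_split(examples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the examples and split it into training and validation parts."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


data = list(range(100))  # stand-in for 100 examples
train, validation = train_validation_split(data)
print(len(train), len(validation))
```

A typical use is to train one model per candidate hyper-parameter value on `train`, keep the value with the lowest error on `validation`, and only then report the error on the test set.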
  • 17. K-fold Cross-Validation. If you do not have enough data, the particular way the data was divided can strongly affect the measured performance of the model. In that case, you need to average the model evaluation over repeated train-test divisions. That is, divide the data into k partitions, train k times using (k-1) partitions each time, and test on the remaining partition. Then average the results to get an estimate of the real model performance. One problem is that no unbiased estimator of the variance of such average error estimators exists.
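The k-fold procedure above can be sketched as follows. To keep the example self-contained, the "model" is a toy predictor that always outputs the mean of its training targets. The function names and data are assumptions, and the folds are contiguous, so real data would normally be shuffled first.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over range(n)."""
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def cross_validate(ys, k):
    """Average held-out squared error of a mean predictor over k folds."""
    fold_errors = []
    for train, test in k_fold_indices(len(ys), k):
        mean = sum(ys[i] for i in train) / len(train)  # "training" the toy model
        fold_errors.append(sum((ys[i] - mean) ** 2 for i in test) / len(test))
    return sum(fold_errors) / k


ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical targets
cv_error = cross_validate(ys, k=3)
print(cv_error)
```

Every example is used for testing exactly once and for training k-1 times, which is what makes the averaged estimate less sensitive to any single split.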