INTRODUCTION TO MACHINE LEARNING AND DEEP LEARNING
Terry Taewoong Um (terry.t.um@gmail.com)
University of Waterloo, Department of Electrical & Computer Engineering
T-robotics.blogspot.com
Facebook.com/TRobotics
CAUTION
• I cannot explain everything
• You cannot get every detail
• Try to get the big picture
• Get some useful keywords
• Connect them with your research
CONTENTS
1. What is Machine Learning?
2. What is Deep Learning?
CONTENTS
1. What is Machine Learning?
WHAT IS MACHINE LEARNING?
"A computer program is said to learn from experience E
with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured
by P, improves with experience E“ – T. Michell (1997)
Example: A program for soccer tactics
T : Win the game
P : Goals
E : (x) Players’ movements
(y) Evaluation
WHAT IS MACHINE LEARNING?
“Toward learning robot table tennis”, J. Peters et al. (2012)
https://youtu.be/SH3bADiB7uQ
"A computer program is said to learn from experience E
with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured
by P, improves with experience E“ – T. Michell (1997)
Terry Taewoong Um (terry.t.um@gmail.com)
TASKS
• Classification : discrete target values (e.g. x : pixels (28×28), y : 0, 1, 2, 3, …, 9)
• Regression : real target values (e.g. x ∈ (0, 100), y : real values)
• Clustering : no target values (e.g. x ∈ (-3, 3) × (-3, 3))
"A computer program is said to learn from experience E
with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured
by P, improves with experience E“ – T. Michell (1997)
Terry Taewoong Um (terry.t.um@gmail.com)
PERFORMANCE
• Classification : 0-1 loss function
• Regression : L2 loss function
• Clustering : no label-based loss (no target values)
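• A minimal NumPy sketch of these two loss functions (my own illustration, not from the slides; the label and target arrays are made-up toy values):

```python
import numpy as np

# Toy predictions vs. targets (made-up values, just for illustration)
y_true_cls = np.array([3, 7, 2, 2, 9])        # true digit labels
y_pred_cls = np.array([3, 1, 2, 2, 4])        # predicted digit labels

y_true_reg = np.array([1.2, 0.5, 2.3, 1.9])   # true real-valued targets
y_pred_reg = np.array([1.0, 0.7, 2.0, 2.4])   # predicted real-valued targets

# 0-1 loss: count 1 for every misclassified sample, averaged over the dataset
zero_one_loss = np.mean(y_true_cls != y_pred_cls)

# L2 loss: squared error, averaged over the dataset (mean squared error)
l2_loss = np.mean((y_true_reg - y_pred_reg) ** 2)

print(zero_one_loss)   # 0.4 (2 of the 5 samples are misclassified)
print(l2_loss)         # mean of [0.04, 0.04, 0.09, 0.25] = 0.105
```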
EXPERIENCE
• Classification : labeled data, (pixels) → (number)
• Regression : labeled data, (x) → (y)
• Clustering : unlabeled data, (x1, x2)
A TOY EXAMPLE
[Figure] Height (cm) vs. Weight (kg) data: Input X = height, Output Y = weight; predict the weight (?) for a new height
[Figure] Height (cm) vs. Weight (kg) data fitted with a line Y = aX + b (e.g. weight ≈ 80 kg at height 180 cm)
Model : Y = aX + b   Parameters : (a, b)
[Goal] Find (a, b) which best fits the given data
[Analytic Solution]
• Least squares problem: from AX = b, X = A#b, where A# is the pseudoinverse of A
• Not always available
[Numerical Solution]
1. Set a cost function
2. Apply an optimization method (e.g. the Gradient Descent (GD) method)
[Figure] The cost surface L over the parameters (a, b)
(http://www.yaldex.com/game-development/1592730043_ch18lev1sec4.html)
[Figure] Local minima problem
(http://mnemstudio.org/neural-networks-multilayer-perceptron-design.htm)
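• A minimal sketch of both routes for the toy model Y = aX + b (my own illustration; the height/weight values, the standardization step, and the learning rate are assumptions): the analytic least-squares solution via the pseudoinverse, and a plain gradient-descent loop on the L2 cost.

```python
import numpy as np

# Made-up height (cm) / weight (kg) samples for the toy example
X = np.array([150., 160., 170., 180., 190.])
Y = np.array([50., 58., 66., 77., 85.])

# Analytic solution: stack [X, 1] so that A @ (a, b) = Y, then use the pseudoinverse
A = np.column_stack([X, np.ones_like(X)])
a_ls, b_ls = np.linalg.pinv(A) @ Y

# Numerical solution: gradient descent on the cost L(a, b) = mean((aX + b - Y)^2).
# Standardize X first; GD converges much faster on well-scaled inputs.
Xs = (X - X.mean()) / X.std()
a, b, lr = 0.0, 0.0, 0.1
for _ in range(1000):
    err = a * Xs + b - Y
    a -= lr * 2 * np.mean(err * Xs)
    b -= lr * 2 * np.mean(err)

# Map the fitted line back to the original height scale
a_gd = a / X.std()
b_gd = b - a * X.mean() / X.std()

print(a_ls, b_ls)   # analytic least-squares fit
print(a_gd, b_gd)   # gradient-descent fit, close to the analytic one
```

For this convex quadratic cost there is a single minimum; the local-minima issue appears once the model or the cost becomes non-convex, e.g. in neural networks.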
[Figure] Age (year) vs. running record (min) data (e.g. around 140 min at age 32)
WHAT WOULD BE THE CORRECT MODEL?
Select a model → Set a cost function → Optimization
[Figure] X vs. Y data with an unknown value (?) to predict
WHAT WOULD BE THE CORRECT MODEL?
1. Regularization 2. Nonparametric model
“overfitting”
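• A small sketch of what "overfitting" looks like (my own illustration with made-up data): a high-degree polynomial drives the training error down but does worse on fresh data than a simpler model.

```python
import numpy as np

rng = np.random.default_rng(3)

def make_data(n):
    x = rng.uniform(0, 1, n)
    return x, np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(n)

x_tr, y_tr = make_data(12)       # small training set
x_new, y_new = make_data(200)    # fresh data from the same source

def poly_fit_errors(deg):
    w = np.polyfit(x_tr, y_tr, deg)                    # least-squares polynomial fit
    tr = np.mean((np.polyval(w, x_tr) - y_tr) ** 2)    # training error
    te = np.mean((np.polyval(w, x_new) - y_new) ** 2)  # error on unseen data
    return tr, te

print(poly_fit_errors(3))   # moderate training error, similar error on new data
print(poly_fit_errors(9))   # tiny training error, (much) larger error on new data
```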
L2 REGULARIZATION
• Add a penalty on the weight magnitudes to the cost: minimize (data loss) + λ‖w‖²  (e.g. w = (a, b) where Y = aX + b)
• Avoid a complicated model!
• Another interpretation : Maximum a Posteriori (MAP) estimation
(http://goo.gl/6GE2ix)
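• A minimal ridge-regression sketch of this idea (my own illustration, not the slide's exact equation; the toy data, polynomial degree, and λ value are assumptions): adding the penalty λ‖w‖² to the least-squares cost shrinks the weights of an over-flexible model, which matches the MAP view with a Gaussian prior on w.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up noisy samples from a smooth underlying function
x = np.linspace(0, 1, 12)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(x.size)

# Degree-9 polynomial features: a deliberately over-flexible model
Phi = np.vander(x, 10, increasing=True)     # columns 1, x, x^2, ..., x^9

def ridge_fit(Phi, y, lam):
    # Minimizer of ||Phi w - y||^2 + lam * ||w||^2:
    #   w = (Phi^T Phi + lam I)^(-1) Phi^T y
    n = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(n), Phi.T @ y)

w_plain = ridge_fit(Phi, y, lam=0.0)    # ordinary least squares: large weights, overfits
w_ridge = ridge_fit(Phi, y, lam=1e-3)   # L2-regularized: smaller weights, smoother fit

print(np.abs(w_plain).max())   # typically very large
print(np.abs(w_ridge).max())   # much smaller
```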
WHAT WOULD BE THE CORRECT MODEL?
1. Regularization 2. Nonparametric model
[Figure] The training error keeps decreasing with training time while the test error starts to rise again; we should stop where the validation error starts to increase
• Training set : for training (parameter optimization)
• Validation set : for early stopping (avoid overfitting), keep watching the validation error
• Test set : for evaluation (measure the performance)
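• A minimal sketch of this split-and-early-stop recipe (my own illustration; the data, the linear model, the learning rate, and the patience value are assumptions): optimize on the training set, keep watching the validation error, keep the best parameters, and touch the test set only for the final evaluation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up dataset, split into training / validation / test sets
X = rng.uniform(-3, 3, size=(300, 5))
w_true = rng.standard_normal(5)
y = X @ w_true + 0.5 * rng.standard_normal(300)

X_tr, y_tr = X[:200], y[:200]          # for training (parameter optimization)
X_va, y_va = X[200:250], y[200:250]    # for early stopping (avoid overfitting)
X_te, y_te = X[250:], y[250:]          # for evaluation only

w = np.zeros(5)
lr = 0.01
best_w, best_va = w.copy(), np.inf
patience, bad_epochs = 10, 0

for epoch in range(1000):
    grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)   # gradient of the training L2 loss
    w -= lr * grad
    va_err = np.mean((X_va @ w - y_va) ** 2)             # keep watching the validation error
    if va_err < best_va:
        best_va, best_w, bad_epochs = va_err, w.copy(), 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:     # no improvement for `patience` epochs: stop here
            break

test_err = np.mean((X_te @ best_w - y_te) ** 2)          # reported once, at the very end
print(best_va, test_err)
```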
NONPARAMETRIC MODEL
• It does not assume a fixed parametric model (e.g. Y = aX + b, Y = aX² + bX + c, etc.)
• It often requires many more samples
• Kernel methods are frequently applied for modeling the data
• Gaussian Process Regression (GPR), a kernel method, is a widely-used nonparametric regression method
• Support Vector Machine (SVM), also a kernel method, is a widely-used nonparametric classification method
[Figure] A kernel function maps the data from the input space to a (higher-dimensional) feature space
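• A minimal sketch of a kernel function (my own illustration; the RBF form and the bandwidth gamma are assumed choices): an RBF (Gaussian) kernel returns similarities that correspond to inner products in a high-dimensional feature space, without ever constructing φ(X) explicitly.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """k(a, b) = exp(-gamma * ||a - b||^2), evaluated for all pairs of rows."""
    # Squared pairwise distances via ||a||^2 + ||b||^2 - 2 a.b
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [3.0, 4.0]])

K = rbf_kernel(X, X)
print(K.round(3))
# Diagonal entries are 1 (each point is identical to itself);
# nearby points (rows 0 and 1) get a large value, distant points a value near 0.
```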
SUPPORT VECTOR MACHINE (SVM)
“Myo”, Thalmic Labs (2013)
https://youtu.be/oWu9TFJjHaM
[Figures] Linear classifiers vs. the maximum-margin classifier; in the dual formulation, the inner products between samples can be replaced by a kernel function
(Support Vector Machine Tutorial, J. Weston, http://goo.gl/19ywcj)
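• A usage sketch (assuming scikit-learn is available; the XOR-style toy data and the hyperparameters C and gamma are my own choices): a kernel SVM separates data that no linear classifier can.

```python
import numpy as np
from sklearn.svm import SVC

# XOR-style data: not linearly separable in the input space
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# RBF-kernel SVM: the kernel implicitly maps the data into a feature space
# where a maximum-margin separating hyperplane exists
clf = SVC(kernel="rbf", C=10.0, gamma=1.0)
clf.fit(X, y)

print(clf.predict(X))         # expected: [0, 1, 1, 0]
print(clf.support_vectors_)   # the training points that define the margin
```

With kernel="linear" the same four points cannot all be classified correctly, which is exactly what the kernel trick buys us.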
GAUSSIAN PROCESS REGRESSION (GPR)
https://youtu.be/YqhLnCm0KXY
https://youtu.be/kvPmArtVoFE
• Gaussian distribution
• Multivariate regression likelihood
• Bayes' rule : posterior ∝ likelihood × prior
• Prediction : condition the joint Gaussian distribution of the observed & predicted values
(https://goo.gl/EO54WN, http://goo.gl/XvOOmf)
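• A minimal NumPy sketch of this conditioning step (my own illustration; the squared-exponential kernel, the noise level, and the toy data are assumptions): the predictive mean and covariance come from conditioning the joint Gaussian of the observed and predicted values.

```python
import numpy as np

def sqexp(A, B, ell=1.0):
    # Squared-exponential kernel k(a, b) = exp(-||a - b||^2 / (2 ell^2)) for 1-D inputs
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-0.5 * d2 / ell ** 2)

X = np.array([-4.0, -2.0, 0.0, 1.5, 3.0])   # observed inputs (made-up)
y = np.sin(X)                               # observed outputs
Xs = np.linspace(-5, 5, 9)                  # inputs where we want predictions
noise = 1e-2

# Condition the joint Gaussian over [y, f*] on the observed y:
#   mean* = K(X*, X) [K(X, X) + noise I]^(-1) y
#   cov*  = K(X*, X*) - K(X*, X) [K(X, X) + noise I]^(-1) K(X, X*)
K = sqexp(X, X) + noise * np.eye(len(X))
Ks = sqexp(Xs, X)
Kss = sqexp(Xs, Xs)

mean = Ks @ np.linalg.solve(K, y)
cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
std = np.sqrt(np.clip(np.diag(cov), 0, None))

print(mean.round(2))   # posterior (predictive) mean at the test inputs
print(std.round(2))    # posterior uncertainty: small near the observed points
```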
DIMENSION REDUCTION
• Kernel methods map the data from the low-dim. original space to a high-dim. feature space: X → φ(X)
• Dimension reduction goes the other way: from a high-dim. space to a low-dim. space
• Principal Component Analysis (PCA) : find the orthogonal axes (= principal components) which maximize the variance of the data
  Y = PX, where the rows of P are the m largest eigenvectors of (1/N)XXᵀ (the covariance matrix)
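• A minimal PCA sketch matching the formula above (my own illustration; the toy data is made up): center the data, eigen-decompose the covariance matrix, and project onto the m leading eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: 3-dimensional samples stored as the columns of X (d x N), as in Y = PX
N = 200
z = rng.standard_normal(N)
X = np.vstack([z,
               2 * z + 0.1 * rng.standard_normal(N),
               0.1 * rng.standard_normal(N)])      # most variance lies along one direction
X = X - X.mean(axis=1, keepdims=True)              # center each dimension

# Covariance matrix (1/N) X X^T and its eigen-decomposition
C = (X @ X.T) / N
eigvals, eigvecs = np.linalg.eigh(C)               # eigenvalues in ascending order

m = 2
P = eigvecs[:, ::-1][:, :m].T                      # rows of P = the m largest eigenvectors
Y = P @ X                                          # projected (m x N) data

print(eigvals[::-1].round(3))   # variance captured by each principal component
print(Y.shape)                  # (2, 200)
```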
DIMENSION REDUCTION
http://jbhuang0604.blogspot.kr/2013/04/miss-korea-2013-contestants-face.html
SUMMARY - PART 1
• Machine Learning
- Tasks : classification, regression, clustering, etc.
- Performance : 0-1 loss, L2 loss, etc.
- Experience : labeled data, unlabeled data
• Machine Learning Process
(1) Select a parametric / nonparametric model
(2) Set a performance measure, including a regularization term
(3) Train on the training data (optimize the parameters) until the validation error increases
(4) Evaluate the final performance on the test set
• Nonparametric models : Support Vector Machine, Gaussian Process Regression
• Dimension reduction : used for pre-processing the data