1. Machine Learning Programming
BDA712-00
Lecturer: Josué Obregón PhD
Kyung Hee University
Department of Big Data Analytics
October 12, 2022
Logistic Regression II:
Model validation, image recognition and multiclass classification
1
Machine Learning Programming, KHU
2. Previously, in our course…
• Your first learning program: building a tiny supervised learning program
• Hyperspace! Multiple linear regression
• Getting real: recognizing a single digit using MNIST
• A discerning machine: from regression to classification
• Walking the gradient: the gradient descent algorithm
3. And today…
• Your first learning program: building a tiny supervised learning program
• Hyperspace! Multiple linear regression
• Getting real: recognizing a single digit using MNIST
• A discerning machine: from regression to classification
• Walking the gradient: the gradient descent algorithm
4. Today's agenda
• Model evaluation and selection
• Training vs. testing
• MNIST dataset
• Data input format
• Recognizing a single digit
• Data preprocessing and encoding
• Going multiclass
• Intuition behind the loss function
• Transforming linear regression to logistic regression
5. How should we think about model selection?
One of the central themes of this class, and of machine learning, is:
Generalizability: We want to construct models that generalize well to unseen data.
• i.e., we want to:
1. Add variables/flexibility as long as doing so helps capture meaningful trends in the data (avoid underfitting)
2. Ignore meaningless random fluctuations in the data (avoid overfitting)
6. How should we think about model selection?
Let's remind ourselves of the first Central Theme of this class.
1. Generalizability: We want to construct models that generalize well to unseen data.
• i.e., we want to:
1. Add variables/flexibility as long as doing so helps capture meaningful trends in the data (avoid underfitting)
2. Ignore meaningless random fluctuations in the data (avoid overfitting)
7. Assessing Model Performance
• Suppose we fit a model $\hat{f}(x)$ to some training data: $\mathrm{Train} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{n}$
• We want to assess how well $\hat{f}$ performs
• We can compute the average squared prediction error over Train:
$$\mathrm{MSE}_{\mathrm{Train}} = \frac{1}{n} \sum_{i=1}^{n} \left( y^{(i)} - \hat{f}(x^{(i)}) \right)^2$$
• But this may push us towards more overfit models.
• Instead, we should compute it using fresh test data: $\mathrm{Test} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$
$$\mathrm{MSE}_{\mathrm{Test}} = \frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} - \hat{f}(x^{(i)}) \right)^2$$
• This would tell us whether $\hat{f}$ generalizes well to new data
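Both formulas compute the same average, just over different data. As a quick sketch (the helper name and example values are mine, not from the slides), the shared computation is:

```python
import numpy as np

def mse(y, y_hat):
    # Average squared prediction error: mean of (y_i - f_hat(x_i))^2
    return np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2)

# Example: predictions off by (0, 0, 2) give MSE = (0 + 0 + 4) / 3
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
error = mse(y_true, y_pred)
```

Passing the training pairs gives MSE_Train; passing held-out pairs gives MSE_Test.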
8. Assessing Model Accuracy: Training Error vs. Testing Error
Here are three different models fit to the same small Train data set. Which of these three
is the best model?
9. Assessing Model Accuracy: Training Error vs. Testing Error

Model                1      2      3
MSE_Train = RSS/n    23.2   5.2    7.5
10. Assessing Model Accuracy: Training Error vs. Testing Error
Here are some new observations, which form our Test data. How well do our models fit the Test data?
Solid green points: Test data. Open grey circles: Train data.
11. Assessing Model Accuracy: Training Error vs. Testing Error

Model       1      2      3
MSE_Train   23.2   5.2    7.5
MSE_Test    24.6   10.3   7.0
12. Assessing Model Accuracy
• As we increase the flexibility of our model, our training set error always decreases
• The same is not true for test set error
• The test set error will decrease as we add flexibility that helps to capture useful
trends
• As we add too much flexibility, the test set error will begin to increase
due to model overfitting
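The behaviour described above can be demonstrated with a small sketch, where polynomial degree stands in for "flexibility"; everything here is an assumed illustration on synthetic data, not course code.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # A cubic trend plus noise.
    x = rng.uniform(-2, 2, n)
    y = x**3 - x + rng.normal(0, 1.0, n)
    return x, y

x_train, y_train = make_data(30)
x_test, y_test = make_data(30)

train_errors, test_errors = [], []
for degree in range(1, 10):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares fit
    train_errors.append(np.mean((y_train - np.polyval(coeffs, x_train)) ** 2))
    test_errors.append(np.mean((y_test - np.polyval(coeffs, x_test)) ** 2))
# Training error can only go down as degree grows; test error typically
# stops improving once the extra flexibility starts fitting noise.
```

Plotting the two error lists against degree reproduces the classic picture: a monotonically falling training curve and a test curve that bottoms out and turns upward.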
14. MNIST Data
• MNIST is a collection of labeled images that’s been assembled
specifically for supervised learning.
• Its name stands for “Modified NIST,” because it’s a remix of earlier data from
the National Institute of Standards and Technology.
• MNIST contains images of handwritten digits, labeled with their
numerical values.
• 60,000 images for training and 10,000 for testing
15. MNIST Data
• Digits are made up of 28 by 28 grayscale pixels, each represented by
one byte.
• In MNIST’s grayscale, 0 stands for “perfect background white,” and
255 stands for “perfect foreground black.”
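A tiny sketch of that encoding; the scaling step at the end is a common convention I'm assuming, not something the slide prescribes.

```python
import numpy as np

# One byte per pixel: 0 = background white, 255 = foreground black
# (the reverse of typical screen grayscale).
image = np.zeros((28, 28), dtype=np.uint8)  # an all-background "image"
image[10:18, 13:15] = 255                   # draw a dark vertical stroke

# Scale to [0, 1] so 1.0 means "fully inked" - an assumed preprocessing
# convention that keeps later arithmetic well-behaved.
scaled = image / 255.0
```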
16. How to interpret image data
https://dev.to/sandeepbalachandran/machine-learning-going-furthur-with-cnn-part-2-41km
17. Preparing the input matrices
1. Start from a 28×28 image.
2. Flatten (reshape) the 2D matrix into a 784-element 1D vector: 0 0 0 235 … 1 2 1 0
3. Add the bias column.
4. The result is the input for our logistic regression algorithm.
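The steps above can be sketched in NumPy (function names here are assumptions, not the course's reference implementation):

```python
import numpy as np

def prepend_bias(X):
    # Insert a column of 1s at position 0 of every row (the bias column).
    return np.insert(X, 0, 1, axis=1)

def prepare_images(images):
    # Flatten each 28x28 matrix into a 784-element row vector,
    # then add the bias column in front.
    flattened = images.reshape(images.shape[0], -1)
    return prepend_bias(flattened)

batch = np.zeros((10, 28, 28))   # a stand-in batch of ten blank images
X = prepare_images(batch)
# X now has shape (10, 785): 784 pixels plus one bias column per image
```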
19. Let's get real (Lab Session 06)
Goal: Build a program on top of our previous implementation that uses the MNIST dataset as input and classifies the images according to the digits from 0 to 9. Additionally, check the generalization capabilities of our model by measuring its performance on unseen data (the test set).
Let's do it!
• https://classroom.github.com/a/R-cw8Rn-
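Under the stated goal, a minimal one-vs-all logistic-regression loop might look like the sketch below. It uses tiny synthetic "images" instead of MNIST, and every name in it is an assumption rather than the lab's reference solution.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, Y, iterations, lr):
    # One weight vector per class, trained by batch gradient descent.
    w = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(iterations):
        predictions = sigmoid(X @ w)
        gradient = X.T @ (predictions - Y) / X.shape[0]
        w -= lr * gradient
    return w

def classify(X, w):
    # Pick the class whose binary classifier is most confident.
    return np.argmax(sigmoid(X @ w), axis=1)

# Tiny synthetic dataset: 3 well-separated classes, 2 features plus bias.
rng = np.random.default_rng(0)
centers = np.array([[0, 0], [5, 0], [0, 5]])
labels = np.repeat([0, 1, 2], 30)
X = centers[labels] + rng.normal(0, 0.5, (90, 2))
X = np.insert(X, 0, 1, axis=1)   # bias column
Y = np.eye(3)[labels]            # one-hot encoding of the labels
w = train(X, Y, iterations=200, lr=0.1)
accuracy = np.mean(classify(X, w) == labels)
```

For the lab itself, the same `train`/`classify` structure would be fed the 785-column MNIST matrices, with accuracy measured on the held-out test set rather than the training data.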
22. Acknowledgements
Some of the lecture notes for this class feature content borrowed, with or without modification, from the following sources:
• 95-791 Data Mining, Carnegie Mellon University, lecture notes (Prof. Alexandra Chouldechova)
• An Introduction to Statistical Learning, with Applications in R (Springer, 2013), with permission from the authors: G. James, D. Witten, T. Hastie and R. Tibshirani
• Machine Learning online course by Andrew Ng