Machine Learning Programming
BDA712-00
Lecturer: Josué Obregón PhD
Kyung Hee University
Department of Big Data Analytics
October 12, 2022
Logistic Regression II:
Model validation, image recognition and multiclass classification
Machine Learning Programming, KHU
Previously, in our course…
• Your first learning program: Building a tiny supervised learning program
• Hyperspace!: Multiple linear regression
• Getting real: Recognize a single digit using MNIST
• A discerning machine: From regression to classification
• Walking the gradient: Gradient descent algorithm
And today…
Today's agenda
• Model evaluation and selection
• Training vs. testing
• MNIST dataset
• Data input format
• Recognizing a single digit
• Data preprocessing and encoding
• Going multiclass
• Intuition behind the loss function
• Transforming linear regression to logistic regression
How should we think about model selection?
One of the central themes of this class, and of machine learning in general, is:
Generalizability: We want to construct models that generalize well to unseen data
• i.e., we want to:
  1. Add variables/flexibility as long as doing so helps capture meaningful trends in the data (avoid underfitting)
  2. Ignore meaningless random fluctuations in the data (avoid overfitting)
How should we think about model selection?
Let’s remind ourselves of the first central theme of this class.
1. Generalizability: We want to construct models that generalize well to unseen data
• i.e., we want to:
  1. Add variables/flexibility as long as doing so helps capture meaningful trends in the data (avoid underfitting)
  2. Ignore meaningless random fluctuations in the data (avoid overfitting)
Assessing Model Performance
• Suppose we fit a model $\hat{f}(x)$ to some training data: $\mathrm{Train} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{n}$
• We want to assess how well $\hat{f}$ performs
• We can compute the average squared prediction error over Train:
$$\mathrm{MSE}_{\mathrm{Train}} = \frac{1}{n} \sum_{i=1}^{n} \left( y^{(i)} - \hat{f}(x^{(i)}) \right)^2$$
• But choosing models by this number alone may push us towards more overfit models.
• Instead, we should compute it using fresh test data: $\mathrm{Test} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$
$$\mathrm{MSE}_{\mathrm{Test}} = \frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} - \hat{f}(x^{(i)}) \right)^2$$
• This would tell us if $\hat{f}$ generalizes well to new data
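The two error measures can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's lab code: the synthetic data (a noisy line), the helper names `mse` and `f_hat`, and the sample sizes n = 30 and m = 20 are all assumptions made for the example.

```python
import numpy as np

def mse(y_true, y_pred):
    """Average squared prediction error between labels and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Synthetic data (an assumption for illustration): y = 2x + 1 plus noise.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, size=30)
y_train = 2 * x_train + 1 + rng.normal(0, 0.3, size=30)
x_test = rng.uniform(-1, 1, size=20)
y_test = 2 * x_test + 1 + rng.normal(0, 0.3, size=20)

# Fit f_hat on Train only: a straight line via least squares.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

def f_hat(x):
    return slope * x + intercept

mse_train = mse(y_train, f_hat(x_train))  # averaged over n = 30 examples
mse_test = mse(y_test, f_hat(x_test))     # averaged over m = 20 examples
print(f"MSE_Train = {mse_train:.3f}, MSE_Test = {mse_test:.3f}")
```

The test examples play no part in fitting `f_hat`, so `mse_test` estimates the error on unseen data.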
Assessing Model Accuracy: Training Error vs. Testing Error
Here are three different models fit to the same small Train data set. Which of these three is the best model?
Assessing Model Accuracy: Training Error vs. Testing Error
Model              1      2      3
MSE_Train = RSS/n  23.2   5.2    7.5
Assessing Model Accuracy: Training Error vs. Testing Error
Here are some new observations, which form our Test data. How well do our models fit the Test data?
[Figure: the three fitted models, with Test data as solid green points and Train data as open grey circles]
Assessing Model Accuracy: Training Error vs. Testing Error
Model      1      2      3
MSE_Train  23.2   5.2    7.5
MSE_Test   24.6   10.3   7.0
Assessing Model Accuracy
• As we increase the flexibility of our model, our training set error decreases (or at least does not increase)
• The same is not true for test set error
• The test set error will decrease as we add flexibility that helps to capture useful trends
• As we add too much flexibility, the test set error will begin to increase due to model overfitting
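One way to see this behaviour is to fit polynomials of increasing degree to a small training set and track both errors. The sketch below is only an illustration: the ground-truth curve, the noise level, the sample sizes, and the helper names `make_data` and `design` are assumptions, and the exact numbers depend on the random seed.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n):
    # Assumed ground truth for illustration: a smooth curve plus noise.
    x = rng.uniform(-1, 1, size=n)
    y = np.sin(3 * x) + rng.normal(0, 0.3, size=n)
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

def design(x, degree):
    # Columns 1, x, x^2, ..., x^degree; a higher degree means more flexibility.
    return np.vander(x, degree + 1, increasing=True)

train_errors, test_errors = [], []
for degree in range(1, 10):
    # Least-squares fit on the training data only.
    w, *_ = np.linalg.lstsq(design(x_train, degree), y_train, rcond=None)
    train_errors.append(np.mean((design(x_train, degree) @ w - y_train) ** 2))
    test_errors.append(np.mean((design(x_test, degree) @ w - y_test) ** 2))
    print(f"degree={degree}  MSE_Train={train_errors[-1]:.3f}  "
          f"MSE_Test={test_errors[-1]:.3f}")
```

Because each polynomial family contains the previous one, the training error cannot go up as the degree grows; the test error carries no such guarantee.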
Computer vision tasks
Image classification
https://medium.com/analytics-vidhya/image-classification-vs-object-detection-vs-image-segmentation-f36db85fe81
MNIST Data
• MNIST is a collection of labeled images that’s been assembled
specifically for supervised learning.
• Its name stands for “Modified NIST,” because it’s a remix of earlier data from
the National Institute of Standards and Technology.
• MNIST contains images of handwritten digits, labeled with their
numerical values.
• 60,000 images for training and 10,000 for testing
MNIST Data
• Digits are made up of 28 by 28 grayscale pixels, each represented by
one byte.
• In MNIST’s grayscale, 0 stands for “perfect background white,” and
255 stands for “perfect foreground black.”
How to interpret image data
https://dev.to/sandeepbalachandran/machine-learning-going-furthur-with-cnn-part-2-41km
Preparing the input matrices
• Start from a 28×28 image
• Flatten or reshape the 2D matrix into a 1D vector: 0 0 0 235 .. .. 1 2 1 0
• We will get a 784-element 1D vector
• Add the bias column
• The result is the input for our logistic regression algorithm
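The preparation steps above can be sketched with NumPy. The random images are a stand-in for MNIST, and placing the bias column first is an assumption; the slides only say that a bias column is added.

```python
import numpy as np

# Stand-in for MNIST (an assumption): 5 random 28x28 images, one byte per pixel.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Flatten each 28x28 matrix into a 784-element row vector (row by row).
flat = images.reshape(images.shape[0], -1)   # shape (5, 784)

# Prepend a column of ones so that the first weight acts as the bias.
X = np.insert(flat, 0, 1, axis=1)            # shape (5, 785)

print(flat.shape, X.shape)
```

Each row of `X` is one image, so the whole data set becomes a single matrix that can be multiplied by the weights in one step.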
Let’s get real (Lab Session 06)
Goal: Build a program on top of our previous implementation that takes the
MNIST dataset as input and classifies the images according to the digits
from 0 to 9. Additionally, check the generalization capabilities of our
model by measuring its performance on unseen data (the test set).
Let’s do it!
• https://classroom.github.com/a/R-cw8Rn-
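As a rough idea of what checking performance on the test set could look like, here is a minimal sketch. It is not the lab solution: the function names `sigmoid`, `classify`, and `accuracy`, the one-weight-column-per-class layout, and the toy weights and inputs are all assumptions chosen so the result can be verified by hand.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def classify(X, w):
    # One column of w per class; pick the class with the highest score.
    y_hat = sigmoid(X @ w)              # shape (examples, classes)
    return np.argmax(y_hat, axis=1)     # shape (examples,)

def accuracy(X, labels, w):
    # Fraction of examples whose predicted class matches the label.
    return float(np.mean(classify(X, w) == labels))

# Toy stand-in for a trained model (hypothetical values, not learned from MNIST):
# 3 classes, and each input row is [bias, x1, x2].
w = np.array([[0.0,  0.0, 0.0],
              [1.0, -1.0, 0.0],
              [0.0,  0.0, 1.0]])
X_test = np.array([[1.0,  2.0, 0.0],    # scores [ 2, -2, 0] -> class 0
                   [1.0, -2.0, 0.0],    # scores [-2,  2, 0] -> class 1
                   [1.0,  0.0, 3.0],    # scores [ 0,  0, 3] -> class 2
                   [1.0,  1.0, 2.0]])   # scores [ 1, -1, 2] -> class 2
labels_test = np.array([0, 1, 2, 0])    # the last example is labeled 0 on purpose

print(accuracy(X_test, labels_test, w))  # 3 of 4 correct -> 0.75
```

For MNIST the same idea would use 785 input columns (784 pixels plus the bias) and 10 weight columns, with `X_test` and `labels_test` taken from the 10,000 test images.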
Going Multiclass
One-hot encoding
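A minimal sketch of one-hot encoding for digit labels, assuming 10 classes (the digits 0 to 9); the function name `one_hot_encode` is hypothetical. Each label becomes a row of zeros with a single 1 in the column of its class.

```python
import numpy as np

def one_hot_encode(labels, n_classes=10):
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, n_classes), dtype=int)
    # Row i gets a 1 in the column given by labels[i].
    encoded[np.arange(labels.size), labels] = 1
    return encoded

Y = one_hot_encode([3, 0, 9])
print(Y)
```

Taking `np.argmax(Y, axis=1)` recovers the original labels, which is the same decoding step a multiclass classifier applies to its outputs.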
Acknowledgements
Some of the lecture notes for this class feature content borrowed with
or without modification from the following sources:
• 95-791 Data Mining, Carnegie Mellon University, lecture notes (Prof.
Alexandra Chouldechova)
• An Introduction to Statistical Learning, with Applications in R (Springer, 2013),
with permission from the authors: G. James, D. Witten, T. Hastie and R.
Tibshirani
• Machine learning online course from Andrew Ng
