Introduction to
Machine Learning
(5 ECTS)
Giovanni Di Liberto
Asst. Prof. in Intelligent Systems, SCSS
Room G.15, O’Reilly Institute
Trinity College Dublin, The University of Dublin
Overview of previous lectures
• Classification
• Evaluation
• Overfitting and Cross-validation
• Chance level
• K-nearest neighbour (KNN)
• Decision tree
Overview of this lecture
• Cross-validation
• Overfitting
• More about Support Vector Machines (SVM)
• Data projection (introduction)
• Introduction to regression
Support Vector Machine (SVM)
Linear Binary SVM Classification
- Scenario where the two classes are linearly separable
- The solid line in the plot on the right represents the decision boundary of an SVM classifier
- This line separates the two classes and stays as far away from the closest training instances as possible
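The "as far away as possible" idea can be made concrete by measuring the margin of a candidate line, i.e., its distance to the closest training point. The sketch below uses hypothetical toy points and two hand-picked lines (none of them come from the slides). It only compares margins; a real SVM finds the widest-margin line by optimisation, for example with scikit-learn's `SVC(kernel="linear")`.

```python
import math

def margin(w, b, points):
    """Smallest distance from any point to the line w[0]*x1 + w[1]*x2 + b = 0."""
    norm = math.hypot(*w)
    return min(abs(w[0] * x1 + w[1] * x2 + b) / norm for x1, x2 in points)

# Hypothetical, linearly separable toy data
class_a = [(1.0, 1.0), (2.0, 2.0)]
class_b = [(4.0, 4.0), (5.0, 5.0)]
points = class_a + class_b

# Two candidate boundaries; both separate the classes correctly
wide = margin((1.0, 1.0), -6.0, points)    # line x1 + x2 = 6, midway between the classes
narrow = margin((1.0, 1.0), -4.5, points)  # line x1 + x2 = 4.5, close to class_a

print(f"margin of midway line: {wide:.3f}")
print(f"margin of off-centre line: {narrow:.3f}")
```

Both lines classify every point correctly, but the midway line leaves more room on each side, which is the property the SVM decision boundary is chosen for.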
Support Vector Machine (SVM)
A more realistic scenario: we are going to get some errors.
We can choose: do we prefer higher precision or higher recall? We usually cannot maximise both, but we can move the decision boundary to get the best possible solution for our goals.
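Moving the decision boundary amounts to changing the threshold applied to the classifier's score. The sketch below uses made-up scores and labels (assumptions for illustration, not data from the lecture) to show how precision and recall react when the threshold moves.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when every score >= threshold is predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical classifier scores and true labels (True = positive class)
scores = [-2.0, -1.0, -0.5, 0.2, 0.4, 1.0, 1.5, 2.5]
labels = [False, False, True, False, True, True, False, True]

for threshold in (-0.6, 0.0, 2.0):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:+.1f}  precision={p:.2f}  recall={r:.2f}")
```

With these numbers, a low threshold catches every positive (recall 1.0) at the cost of false alarms, while a high threshold makes no false alarms (precision 1.0) but misses most positives.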
Overfitting
https://towardsdatascience.com/techniques-for-handling-underfitting-and-overfitting-in-machine-learning-348daa2380b9
Overfitted model: it does not generalise well!
Maybe some datapoints were bad measurements or mislabelled
Overfitting
Controlling for overfitting
- We want to make sure that our model really works, i.e., that it generalises, and not that it only appears to work (good classification) because it is overfitting
- To do so, we fit the model on one portion of the data and test it on a separate portion. This approach controls for overfitting because the model is evaluated on unseen data (cross-validation)
Preventing overfitting
- More complex models tend to overfit more
- There are strategies to reduce the amount of overfitting (e.g., regularisation,
early stopping)
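A small experiment can show why evaluation on unseen data matters. The sketch below is an illustration under stated assumptions: synthetic one-dimensional data with 20% of the labels flipped, and a plain k-nearest-neighbour classifier written from scratch. With k = 1 the model memorises the training set, noise included, so it is the most flexible (most complex) setting.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def make_data(n):
    """Synthetic data: the true class is x > 0.5, with 20% of labels flipped."""
    data = []
    for _ in range(n):
        x = random.random()
        y = x > 0.5
        if random.random() < 0.2:
            y = not y  # simulate a bad measurement or a mislabelled point
        data.append((x, y))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    votes = sum(y for _, y in nearest)
    return votes * 2 > k

def accuracy(train, data, k):
    return sum(knn_predict(train, x, k) == y for x, y in data) / len(data)

train, test = make_data(200), make_data(200)
for k in (1, 15):
    print(f"k={k:2d}  train accuracy={accuracy(train, train, k):.2f}  "
          f"test accuracy={accuracy(train, test, k):.2f}")
```

The training accuracy for k = 1 is perfect by construction, because each training point is its own nearest neighbour. Only the score on the separate test portion says how well the model generalises.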
Cross-validation (controlling for overfitting)
https://towardsdatascience.com/cross-validation-k-fold-vs-monte-carlo-e54df2fc179b
[Figure: two-class data (Class 1, Class 2) shown in three panels: ground truth, training set, test set]
Cross-validation
[Figure: two-class data (Class 1, Class 2) shown in three panels: ground truth, training set, test set]
The model is overfitting: it is too complex.
At least the cross-validation controls for that, i.e., the prediction on the test set is not very good.
k-fold Cross-validation
https://towardsdatascience.com/cross-validation-k-fold-vs-monte-carlo-e54df2fc179b
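In k-fold cross-validation the data is cut into k parts, and each part serves as the test set exactly once while the remaining parts form the training set. The helper below is a minimal sketch of that splitting step only; the function name is hypothetical, and in practice one would usually shuffle the data first or rely on a library routine such as scikit-learn's `KFold`.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
for fold_number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"fold {fold_number}: train={train_idx} test={test_idx}")
```

A model would be fitted on each training split and scored on the matching test split; the k scores are then averaged.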
Baseline – real vs. ideal
- Coin flip:
  - 2 classes (heads vs. tails)
  - 50-50 chance
  - Random
- Is that digit a zero or a one?
  - 2 classes
  - Let’s use a simple linear classifier. We definitely want this classifier to perform better than chance.
  - What is chance? Well, 2 classes... Isn’t that a 50-50 chance of getting it right?
  - Not necessarily. That depends on the probability of encountering a 1 or a 0.
  - So, let’s say that we have an equal number of zeros and ones in the dataset. Does that mean a random classifier has a 50-50 chance of getting it right?
  - Yes, but only with infinite data.
Baseline – real vs. ideal
- With a small dataset, a random classifier is more likely to score well above the theoretical chance level purely by luck
- So, classification results should be compared to a baseline (or chance level) that is calculated by taking into account the sample size (N)
- We will see that in the coming lectures
- Things get more complicated with multiclass and imbalanced datasets
https://www.discovermagazine.com/mind/machine-learning-exceeding-chance-level-by-chance
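One common way to make the baseline depend on N is to model a random classifier as N independent coin flips and ask which accuracy it would reach or exceed only rarely. The sketch below assumes two balanced classes (p = 0.5) and a 5% cut-off; both the function name and the choice of 5% are assumptions for illustration, not the method the course will present.

```python
from math import comb

def chance_threshold(n, p=0.5, alpha=0.05):
    """Smallest accuracy k/n that a random classifier reaches with probability <= alpha.

    Assumes the number of correct guesses follows a Binomial(n, p) distribution.
    A result above 1.0 means no accuracy is surprising enough for such a small n.
    """
    tail = 0.0
    # Walk down from k = n, accumulating P(correct guesses >= k)
    for k in range(n, -1, -1):
        tail += comb(n, k) * p**k * (1 - p) ** (n - k)
        if tail > alpha:
            return (k + 1) / n
    return 0.0

for n in (20, 100, 1000):
    print(f"N={n:4d}  accuracy needed to beat chance: {chance_threshold(n):.3f}")
```

For N = 20 the threshold is 0.75, far above the nominal 50%, and it moves towards 0.5 as N grows, which is the "yes, with infinite data" point from the previous slide.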
Precision vs. recall
“Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, Aurélien Géron, 2019
Trade-off
ROC curve
Classification – evaluation metrics
F1-Score = harmonic mean of precision and recall
Precision, recall, and F1-score apply to balanced binary, imbalanced binary, and multiclass classification.
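The harmonic mean can be written out directly. The helper below is a minimal sketch with a hypothetical name; it shows why F1 is only high when precision and recall are both high.

```python
def harmonic_f1(precision, recall):
    """F1-score: harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(harmonic_f1(0.5, 0.5))  # balanced precision and recall
print(harmonic_f1(0.9, 0.1))  # one high value cannot compensate for a low one
```

An arithmetic mean would give 0.5 in both cases; the harmonic mean penalises the second one because it is pulled towards the smaller value.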
Classification in Python
“Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, Aurélien Géron, 2019
X is the data matrix (features)
y is the class (‘five’ or ‘not a five’)
Support Vector Machine (SVM)
“Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, Aurélien Géron, 2019
Support Vector Machine (SVM)
- Some datasets are not even close to being linearly separable.
- One approach is to add polynomial features, e.g., x2 = (x1)^2, x3 = (x1)^3
- Another approach is to use kernel methods
https://towardsdatascience.com/the-kernel-trick-c98cdbcaeb3f
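The polynomial-feature idea can be checked on a tiny example. The data below is hypothetical: along x1 alone, class 0 sits between two groups of class 1, so no single threshold on x1 separates them. Adding x2 = (x1)^2 makes one straight cut sufficient. The cut at 2.5 is hand-picked for illustration; an SVM would learn it.

```python
# Hypothetical 1-D data: class 1 lies on both sides of class 0
x1 = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [1, 1, 0, 0, 0, 1, 1]

# Add the polynomial feature x2 = x1 squared
X_poly = [(value, value**2) for value in x1]

# In the (x1, x2) plane a horizontal line x2 = 2.5 separates the classes
predicted = [1 if x2 > 2.5 else 0 for _, x2 in X_poly]

print(predicted == y)
```

Kernel methods reach a similar effect without building the extra features explicitly.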
LDA: Linear Discriminant Analysis and Data projection
[Figure: two-class data, Y ∈ {green, blue}, plotted on axes x1 and x2; X: [x1, x2]]
Sometimes it is easier to look at things from a different angle, instead of searching for a complicated solution
Data projection
Y ∈ {green, blue}
X: [x1, x2]
Xproj = X - [2,0]
Xproj = [x1, x2] - [2,0]
Xproj = [x1-2, x2]
[Figure: the data on axes x1, x2 and on the shifted axes xproj1, xproj2]
Data projection
Y ∈ {green, blue}
X: [x1, x2]
Xproj = X - [2,3]
Xproj = [x1, x2] - [2,3]
Xproj = [x1-2, x2-3]
[Figure: the data on axes x1, x2 and on the shifted axes xproj1, xproj2]
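The shift Xproj = X - [2,3] can be written as a one-line transformation. The points below are made up for illustration; only the offset [2,3] comes from the slide.

```python
def shift(points, offset):
    """Express each point relative to a new origin located at `offset`."""
    return [(x1 - offset[0], x2 - offset[1]) for x1, x2 in points]

# Hypothetical data points in the original (x1, x2) axis system
X = [(2.0, 3.0), (4.0, 5.0), (3.0, 1.0)]
X_proj = shift(X, (2.0, 3.0))

print(X_proj)
```

The point that sat at [2,3] becomes the origin of the new axis system, and every other point keeps its position relative to it.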
Data projection
A projection is a transformation of data points from one axis system to another
[Figure: the same points shown on the original axes (x1, x2) and on new axes (xproj1, xproj2)]
Data projection
[Figure: two panels on axes x1 and x2, labelled "Bad projection" and "Good projection"]
Data projection
LDA: Linear Discriminant Analysis
[Figure: data on axes x1 and x2 with the axis labelled "Good projection"]
Find the axis that:
- Maximises the variance of the class means (between-class variance)
- Minimises the within-class variance
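The two criteria can be combined into a single score per candidate axis: the squared distance between the projected class means divided by the summed within-class variances. The sketch below evaluates this Fisher-style ratio for two hand-picked axes on hypothetical data. It does not search for the best axis, which is what LDA does (for instance scikit-learn's `LinearDiscriminantAnalysis`).

```python
from statistics import mean, pvariance

def project(points, direction):
    """1-D coordinate of each point along a unit-length direction vector."""
    return [x1 * direction[0] + x2 * direction[1] for x1, x2 in points]

def fisher_score(class_a, class_b, direction):
    """Between-class separation divided by within-class spread along `direction`."""
    pa, pb = project(class_a, direction), project(class_b, direction)
    between = (mean(pa) - mean(pb)) ** 2
    within = pvariance(pa) + pvariance(pb)
    return between / within

# Hypothetical data: the classes overlap along x1 but are well apart along x2
green = [(1.0, 3.0), (2.0, 4.0), (3.0, 3.5)]
blue = [(1.5, 1.0), (2.5, 0.5), (3.5, 1.5)]

score_x1 = fisher_score(green, blue, (1.0, 0.0))  # project onto the x1 axis
score_x2 = fisher_score(green, blue, (0.0, 1.0))  # project onto the x2 axis

print(f"projection onto x1: {score_x1:.4f}")
print(f"projection onto x2: {score_x2:.4f}")
```

The axis with the larger score is the better projection in the sense of the two bullet points above. Note that the ratio is undefined when both classes collapse to single points along an axis (zero within-class variance).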
Data projection
[Figure: data on axes x1 and x2, projected onto the axis xproj labelled "Good projection"]
Perfect separability between classes
Discussion
• How could we design a pothole detector that can map the potholes in
Dublin? What would be the data? How would we use this data to
perform classification and detect the potholes?
[Diagram: Problem/question → Data collection → Preprocessing / cleaning → Analysing → Interpretation / outcome; additional labels: "Improve", "ML", and "Visualisation" (three times)]
Supervised Learning
y = f(X)
Model training (learning or fit): X → f → y, where X is known, y is known, and f is unknown
Using the model (test): Xnew → f → ynew, where Xnew is known, f is known, and ynew is unknown
Classification: y is a category/class
Regression: y is a number
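The two phases can be sketched with a deliberately simple classifier. Everything below is an assumption made for illustration: the data is made up, and the "model" is just a threshold halfway between the two class means. The point is the split between learning f from known X and y, and then applying f to new inputs.

```python
from statistics import mean

def fit(X, y):
    """Training: learn f from known X and known y.

    Here f is a threshold placed halfway between the two class means.
    """
    mean_a = mean(x for x, label in zip(X, y) if label == "A")
    mean_b = mean(x for x, label in zip(X, y) if label == "B")
    boundary = (mean_a + mean_b) / 2
    upper, lower = ("A", "B") if mean_a > mean_b else ("B", "A")
    return lambda x: upper if x > boundary else lower

# Training phase: X and y are known, f is unknown
X = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
y = ["B", "B", "B", "A", "A", "A"]
f = fit(X, y)

# Test phase: X_new and f are known, y_new is unknown until we predict it
X_new = [2.5, 7.5]
y_new = [f(x) for x in X_new]
print(y_new)
```

Libraries such as scikit-learn follow the same pattern with `fit` and `predict` methods on a model object.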
Regression
Classification:
Find the decision boundary, e.g.:
Combination of X > boundary → y is class A
Combination of X < boundary → y is class B
Regression:
Find the function that maps X to a number, e.g.:
y = Combination of X
Regression
X1: cost of materials
X2: inflation
y: average cost of a house
Using the past (of X) to predict the future (of y)
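A linear regression of this kind can be sketched with the closed-form least-squares fit. To keep it short, the example uses a single feature (cost of materials) instead of the two on the slide, and all numbers are invented for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    covariance = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    variance = sum((xi - mx) ** 2 for xi in x)
    slope = covariance / variance
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical past observations (arbitrary units)
materials_cost = [1.0, 2.0, 3.0, 4.0]
house_cost = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(materials_cost, house_cost)

# Use the fitted line to predict y for a new, unseen value of x
predicted = slope * 5.0 + intercept
print(f"y = {slope:.1f} * x + {intercept:.1f}; prediction for x = 5: {predicted:.1f}")
```

Unlike classification, the output is a number on a continuous scale. With several features, a library routine such as scikit-learn's `LinearRegression` would fit one weight per feature.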
Regression
Dependent variable: y
Independent variables: X
Regression in Python
“Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, Aurélien Géron, 2019
X is the data matrix (features)
y is the target value (a number)


Editor's Notes

  1. Mention that the main challenge is always to determine those axes (features). Not just 2D, multidimensional. It could be age, height,