SlideShare a Scribd company logo
Supervised
Learning
Orozco Hsu
2022-05-16 1
About me
• Education
• NCU (MIS)、NCCU (CS)
• Work Experience
• Telecom big data Innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data/ ML/ AIOT/ AI Columnist
2
Tutorial
Content
3
Image Classification using Logistic Regression
Build a supervised model and prediction
Home work
What is the supervised learning
Code
• Download code
• https://github.com/orozcohsu/ntunhs_2023_01
• Folder/file
• 20230516_02
4
Add-ons
5
What is the supervised learning
6
Supervised learning vs. Unsupervised learning
• Supervised learning: discover patterns in the data that relate data
attributes with a target (class labeled) attribute.
• These patterns are then utilized to predict the values of the target attribute in
future data instances.
• Unsupervised learning: The data have no target attribute.
• We want to explore the data to find some intrinsic structures in them.
• Classic supervised learning algorithm
• Classification
• Regression
7
Supervised learning
8
Ref: https://www.tibco.com/reference-center/what-is-supervised-learning
What is Classification in Supervised Learning?
• Classification is where an algorithm is trained to classify input data on
discrete variables.
• During training, algorithms are given training input data with a class
label. For example, training data might consist of the last credit card
bills of a set of customers, labeled with whether they made a future
purchase or not.
• When a new customer’s credit balance is presented to the algorithm,
it classifies the customer to either will purchase or will not purchase
group.
9
What is Regression in Supervised Learning?
• Regression is a supervised learning method where an algorithm is
trained to predict an output from a continuous range of possible
values. For example, real estate training data would take note of the
location, area, and other relevant parameters. The output is the price
of the specific real estate.
• In regression, an algorithm needs to identify a functional relationship
between the input parameters and the output.
• The output value is not discrete like in classification, instead it is a
function of the continuous outputs.
10
Real-life Applications of Classification
• Binary classification (Most companies use)
• Spam detection
• Churn prediction
• Conversion prediction
• Imbalanced Classification
• Fraud detection: In the labeled data set used for training, only a small number of
inputs are labeled as a fraud.
• Medical diagnostics: In a large pool of samples, ones with a positive case of a disease
might be far less.
• Multi-class Classification
• Face classification: Based on the training data, a model categorizes a photo and maps
it to a specific person.
• Email classification: Multi-class classification is used to segregate emails into various
categories – social, education, work, and family.
11
Real-life Applications of Regression
• Linear regression
• It can be used to predict values within a continuous
range, (e.g. sales, price forecasting) or classifying
them into categories (e.g. cat, dog - logistic
regression)
• Polynomial regression
• It is used for a more complex data set that will not
fit neatly into a linear regression. An algorithm is
trained with a complex, labeled data set that may
not fit well under a straight line regression.
12
Image Classification using Logistic Regression
13
Image Classification using Logistic Regression
14
Unzip the Images.zip dataset
Image Classification using Logistic Regression
• Embedder:
• Inception V3: Google’s Inception v3 model trained on ImageNet.
• SqueezeNet: Deep model for image recognition that achieves AlexNet-level
accuracy on ImageNet with 50x fewer parameter.
• VGG-16: 16-layer image recognition model trained on ImageNet.
• Vgg-19: 19-layer image recognition model trained on ImageNet.
• Painter: A model tained to predict painters from artwork images.
• DeepLoc: A model trained to analyze yeast cell images.
• openface: Face recognition model trained on FaceScrub and CASIA-WebFace
dataset.
• http://vintage.winklerbros.net/facescrub.html
15
ImageNet/ Inception V3
• ImageNet is an image database. The images in the database are
organized into a hierarchy, with each node of the hierarchy depicted
by hundreds and thousands of images.
• Sample size of Training data: 1 million
• Sample size of validation data: 50000
• Number of classes: 1000
16
Outlier detection
17
Data Preprocess (outlier detection)
18
Use iris.tab dataset
Outlier Detection
• Many applications require being
able to decide whether a new
observation belongs to the same
distribution as existing
observations (it is an inlier), or
should be considered as different
(it is an outlier). Often, this ability
is used to clean real data sets.
19
Ref: https://scikit-learn.org/stable/modules/outlier_detection.html#outlier-detection
Evaluation with outliers/ inliers
20
With outliers Without outliers
Tree model
21
Model and explanation
22
Use iris.tab dataset
Model and explanation (Tree model)
23
What is the most significant variable?
Tree also allows the numeric target
24
Use housing.tab dataset
Tree model and prediction
25
26
Open model_build_on_preduction.ows
• Main concepts:
• Data exploration
• Feature Statistics
• Rank
• Data Preprocess
• Preprocess
• Data Split
• Data Sampler
• Model
• Tree/ Tree Viewer
• Save Model/ Load Model
• Test and Score
• Confusion Matrix
• Prediction
Homework
• Please attempt to apply the below-mentioned models to your own
binary dataset, and endeavor to identify the model with the optimal
performances as well as the most significant variables. (in next page)
• Furthermore, it is advised to elucidate the underlying factors of model;
you should include those subsections as below:
• Introduction to your dataset
• Data exploration
• Model evaluation
• Conclusion
• Use PPT with texts and illustrations to present your observations.
27
28

More Related Content

Similar to Supervised Learning

H2O World - Intro to Data Science with Erin Ledell
H2O World - Intro to Data Science with Erin LedellH2O World - Intro to Data Science with Erin Ledell
H2O World - Intro to Data Science with Erin Ledell
Sri Ambati
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Rahul Jain
 
DATA MINING TOOL- ORANGE
DATA MINING TOOL- ORANGEDATA MINING TOOL- ORANGE
DATA MINING TOOL- ORANGENeeraj Goswami
 
High time to add machine learning to your information security stack
High time to add machine learning to your information security stackHigh time to add machine learning to your information security stack
High time to add machine learning to your information security stack
Minhaz A V
 
Cluster Analysis in Data Science.pptx
Cluster Analysis in Data Science.pptxCluster Analysis in Data Science.pptx
Cluster Analysis in Data Science.pptx
infosec train
 
Cluster Analysis in Data Science.pptx
Cluster Analysis in Data Science.pptxCluster Analysis in Data Science.pptx
Cluster Analysis in Data Science.pptx
Infosectrain3
 
CodeLess Machine Learning
CodeLess Machine LearningCodeLess Machine Learning
CodeLess Machine Learning
Sharjeel Imtiaz
 
The Genopolis Microarray database
The Genopolis Microarray databaseThe Genopolis Microarray database
The Genopolis Microarray database
Novartis Institutes for BioMedical Research
 
AI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptxAI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptx
kprasad8
 
Chapter 4 Classification in data sience .pdf
Chapter 4 Classification in data sience .pdfChapter 4 Classification in data sience .pdf
Chapter 4 Classification in data sience .pdf
AschalewAyele2
 
Data mining
Data miningData mining
Data mining
heba_ahmad
 
Distilling dark knowledge from neural networks
Distilling dark knowledge from neural networksDistilling dark knowledge from neural networks
Distilling dark knowledge from neural networks
Alexander Korbonits
 
BMDSE v1 - Data Scientist Deck
BMDSE v1 - Data Scientist DeckBMDSE v1 - Data Scientist Deck
BMDSE v1 - Data Scientist Deck
Sasha Lazarevic
 
CLUSTER ANALYSIS.pptx
CLUSTER ANALYSIS.pptxCLUSTER ANALYSIS.pptx
CLUSTER ANALYSIS.pptx
Lithal Fragrance
 
Lecture 2 - Introduction to Machine Learning, a lecture in subject module Sta...
Lecture 2 - Introduction to Machine Learning, a lecture in subject module Sta...Lecture 2 - Introduction to Machine Learning, a lecture in subject module Sta...
Lecture 2 - Introduction to Machine Learning, a lecture in subject module Sta...
Maninda Edirisooriya
 
Data mining
Data miningData mining
Data mining
pradeepa n
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Lucidworks
 
BIG DATA AND MACHINE LEARNING
BIG DATA AND MACHINE LEARNINGBIG DATA AND MACHINE LEARNING
BIG DATA AND MACHINE LEARNING
Umair Shafique
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Alok Singh
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Joaquin Delgado PhD.
 

Similar to Supervised Learning (20)

H2O World - Intro to Data Science with Erin Ledell
H2O World - Intro to Data Science with Erin LedellH2O World - Intro to Data Science with Erin Ledell
H2O World - Intro to Data Science with Erin Ledell
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
DATA MINING TOOL- ORANGE
DATA MINING TOOL- ORANGEDATA MINING TOOL- ORANGE
DATA MINING TOOL- ORANGE
 
High time to add machine learning to your information security stack
High time to add machine learning to your information security stackHigh time to add machine learning to your information security stack
High time to add machine learning to your information security stack
 
Cluster Analysis in Data Science.pptx
Cluster Analysis in Data Science.pptxCluster Analysis in Data Science.pptx
Cluster Analysis in Data Science.pptx
 
Cluster Analysis in Data Science.pptx
Cluster Analysis in Data Science.pptxCluster Analysis in Data Science.pptx
Cluster Analysis in Data Science.pptx
 
CodeLess Machine Learning
CodeLess Machine LearningCodeLess Machine Learning
CodeLess Machine Learning
 
The Genopolis Microarray database
The Genopolis Microarray databaseThe Genopolis Microarray database
The Genopolis Microarray database
 
AI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptxAI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptx
 
Chapter 4 Classification in data sience .pdf
Chapter 4 Classification in data sience .pdfChapter 4 Classification in data sience .pdf
Chapter 4 Classification in data sience .pdf
 
Data mining
Data miningData mining
Data mining
 
Distilling dark knowledge from neural networks
Distilling dark knowledge from neural networksDistilling dark knowledge from neural networks
Distilling dark knowledge from neural networks
 
BMDSE v1 - Data Scientist Deck
BMDSE v1 - Data Scientist DeckBMDSE v1 - Data Scientist Deck
BMDSE v1 - Data Scientist Deck
 
CLUSTER ANALYSIS.pptx
CLUSTER ANALYSIS.pptxCLUSTER ANALYSIS.pptx
CLUSTER ANALYSIS.pptx
 
Lecture 2 - Introduction to Machine Learning, a lecture in subject module Sta...
Lecture 2 - Introduction to Machine Learning, a lecture in subject module Sta...Lecture 2 - Introduction to Machine Learning, a lecture in subject module Sta...
Lecture 2 - Introduction to Machine Learning, a lecture in subject module Sta...
 
Data mining
Data miningData mining
Data mining
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
 
BIG DATA AND MACHINE LEARNING
BIG DATA AND MACHINE LEARNINGBIG DATA AND MACHINE LEARNING
BIG DATA AND MACHINE LEARNING
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
 

More from FEG

Sequence Model pytorch at colab with gpu.pdf
Sequence Model pytorch at colab with gpu.pdfSequence Model pytorch at colab with gpu.pdf
Sequence Model pytorch at colab with gpu.pdf
FEG
 
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
FEG
 
資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf
資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf
資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf
FEG
 
Pytorch cnn netowork introduction 20240318
Pytorch cnn netowork introduction 20240318Pytorch cnn netowork introduction 20240318
Pytorch cnn netowork introduction 20240318
FEG
 
2023 Decision Tree analysis in business practices
2023 Decision Tree analysis in business practices2023 Decision Tree analysis in business practices
2023 Decision Tree analysis in business practices
FEG
 
2023 Clustering analysis using Python from scratch
2023 Clustering analysis using Python from scratch2023 Clustering analysis using Python from scratch
2023 Clustering analysis using Python from scratch
FEG
 
2023 Data visualization using Python from scratch
2023 Data visualization using Python from scratch2023 Data visualization using Python from scratch
2023 Data visualization using Python from scratch
FEG
 
2023 Supervised Learning for Orange3 from scratch
2023 Supervised Learning for Orange3 from scratch2023 Supervised Learning for Orange3 from scratch
2023 Supervised Learning for Orange3 from scratch
FEG
 
2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules
FEG
 
202312 Exploration Data Analysis Visualization (English version)
202312 Exploration Data Analysis Visualization (English version)202312 Exploration Data Analysis Visualization (English version)
202312 Exploration Data Analysis Visualization (English version)
FEG
 
202312 Exploration of Data Analysis Visualization
202312 Exploration of Data Analysis Visualization202312 Exploration of Data Analysis Visualization
202312 Exploration of Data Analysis Visualization
FEG
 
Transfer Learning (20230516)
Transfer Learning (20230516)Transfer Learning (20230516)
Transfer Learning (20230516)
FEG
 
Image Classification (20230411)
Image Classification (20230411)Image Classification (20230411)
Image Classification (20230411)
FEG
 
Google CoLab (20230321)
Google CoLab (20230321)Google CoLab (20230321)
Google CoLab (20230321)
FEG
 
UnSupervised Learning Clustering
UnSupervised Learning ClusteringUnSupervised Learning Clustering
UnSupervised Learning Clustering
FEG
 
Data Visualization in Excel
Data Visualization in ExcelData Visualization in Excel
Data Visualization in Excel
FEG
 
6_Association_rule_碩士班第六次.pdf
6_Association_rule_碩士班第六次.pdf6_Association_rule_碩士班第六次.pdf
6_Association_rule_碩士班第六次.pdf
FEG
 
5_Neural_network_碩士班第五次.pdf
5_Neural_network_碩士班第五次.pdf5_Neural_network_碩士班第五次.pdf
5_Neural_network_碩士班第五次.pdf
FEG
 
4_Regression_analysis.pdf
4_Regression_analysis.pdf4_Regression_analysis.pdf
4_Regression_analysis.pdf
FEG
 
3_Decision_tree.pdf
3_Decision_tree.pdf3_Decision_tree.pdf
3_Decision_tree.pdf
FEG
 

More from FEG (20)

Sequence Model pytorch at colab with gpu.pdf
Sequence Model pytorch at colab with gpu.pdfSequence Model pytorch at colab with gpu.pdf
Sequence Model pytorch at colab with gpu.pdf
 
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
學院碩士班_非監督式學習_使用Orange3直接使用_分群_20240417.pdf
 
資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf
資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf
資料視覺化_透過Orange3進行_無須寫程式直接使用_碩士學程_202403.pdf
 
Pytorch cnn netowork introduction 20240318
Pytorch cnn netowork introduction 20240318Pytorch cnn netowork introduction 20240318
Pytorch cnn netowork introduction 20240318
 
2023 Decision Tree analysis in business practices
2023 Decision Tree analysis in business practices2023 Decision Tree analysis in business practices
2023 Decision Tree analysis in business practices
 
2023 Clustering analysis using Python from scratch
2023 Clustering analysis using Python from scratch2023 Clustering analysis using Python from scratch
2023 Clustering analysis using Python from scratch
 
2023 Data visualization using Python from scratch
2023 Data visualization using Python from scratch2023 Data visualization using Python from scratch
2023 Data visualization using Python from scratch
 
2023 Supervised Learning for Orange3 from scratch
2023 Supervised Learning for Orange3 from scratch2023 Supervised Learning for Orange3 from scratch
2023 Supervised Learning for Orange3 from scratch
 
2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules2023 Supervised_Learning_Association_Rules
2023 Supervised_Learning_Association_Rules
 
202312 Exploration Data Analysis Visualization (English version)
202312 Exploration Data Analysis Visualization (English version)202312 Exploration Data Analysis Visualization (English version)
202312 Exploration Data Analysis Visualization (English version)
 
202312 Exploration of Data Analysis Visualization
202312 Exploration of Data Analysis Visualization202312 Exploration of Data Analysis Visualization
202312 Exploration of Data Analysis Visualization
 
Transfer Learning (20230516)
Transfer Learning (20230516)Transfer Learning (20230516)
Transfer Learning (20230516)
 
Image Classification (20230411)
Image Classification (20230411)Image Classification (20230411)
Image Classification (20230411)
 
Google CoLab (20230321)
Google CoLab (20230321)Google CoLab (20230321)
Google CoLab (20230321)
 
UnSupervised Learning Clustering
UnSupervised Learning ClusteringUnSupervised Learning Clustering
UnSupervised Learning Clustering
 
Data Visualization in Excel
Data Visualization in ExcelData Visualization in Excel
Data Visualization in Excel
 
6_Association_rule_碩士班第六次.pdf
6_Association_rule_碩士班第六次.pdf6_Association_rule_碩士班第六次.pdf
6_Association_rule_碩士班第六次.pdf
 
5_Neural_network_碩士班第五次.pdf
5_Neural_network_碩士班第五次.pdf5_Neural_network_碩士班第五次.pdf
5_Neural_network_碩士班第五次.pdf
 
4_Regression_analysis.pdf
4_Regression_analysis.pdf4_Regression_analysis.pdf
4_Regression_analysis.pdf
 
3_Decision_tree.pdf
3_Decision_tree.pdf3_Decision_tree.pdf
3_Decision_tree.pdf
 

Recently uploaded

一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Linda486226
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 

Recently uploaded (20)

一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 

Supervised Learning

  • 2. About me • Education • NCU (MIS)、NCCU (CS) • Work Experience • Telecom big data Innovation • AI projects • Retail marketing technology • User Group • TW Spark User Group • TW Hadoop User Group • Taiwan Data Engineer Association Director • Research • Big Data/ ML/ AIOT/ AI Columnist 2
  • 3. Tutorial Content 3 Image Classification using Logistic Regression Build a supervised model and prediction Home work What is the supervised learning
  • 4. Code • Download code • https://github.com/orozcohsu/ntunhs_2023_01 • Folder/file • 20230516_02 4
  • 6. What is the supervised learning 6
  • 7. Supervised learning vs. Unsupervised learning • Supervised learning: discover patterns in the data that relate data attributes with a target (class labeled) attribute. • These patterns are then utilized to predict the values of the target attribute in future data instances. • Unsupervised learning: The data have no target attribute. • We want to explore the data to find some intrinsic structures in them. • Classic supervised learning algorithm • Classification • Regression 7
  • 9. What is Classification in Supervised Learning? • Classification is where an algorithm is trained to classify input data on discrete variables. • During training, algorithms are given training input data with a class label. For example, training data might consist of the last credit card bills of a set of customers, labeled with whether they made a future purchase or not. • When a new customer’s credit balance is presented to the algorithm, it classifies the customer to either will purchase or will not purchase group. 9
  • 10. What is Regression in Supervised Learning? • Regression is a supervised learning method where an algorithm is trained to predict an output from a continuous range of possible values. For example, real estate training data would take note of the location, area, and other relevant parameters. The output is the price of the specific real estate. • In regression, an algorithm needs to identify a functional relationship between the input parameters and the output. • The output value is not discrete like in classification, instead it is a function of the continuous outputs. 10
  • 11. Real-life Applications of Classification • Binary classification (Most companies use) • Spam detection • Churn prediction • Conversion prediction • Imbalanced Classification • Fraud detection: In the labeled data set used for training, only a small number of inputs are labeled as a fraud. • Medical diagnostics: In a large pool of samples, ones with a positive case of a disease might be far less. • Multi-class Classification • Face classification: Based on the training data, a model categorizes a photo and maps it to a specific person. • Email classification: Multi-class classification is used to segregate emails into various categories – social, education, work, and family. 11
  • 12. Real-life Applications of Regression • Linear regression • It can be used to predict values within a continuous range, (e.g. sales, price forecasting) or classifying them into categories (e.g. cat, dog - logistic regression) • Polynomial regression • It is used for a more complex data set that will not fit neatly into a linear regression. An algorithm is trained with a complex, labeled data set that may not fit well under a straight line regression. 12
  • 13. Image Classification using Logistic Regression 13
  • 14. Image Classification using Logistic Regression 14 Unzip the Images.zip dataset
  • 15. Image Classification using Logistic Regression • Embedder: • Inception V3: Google’s Inception v3 model trained on ImageNet. • SqueezeNet: Deep model for image recognition that achieves AlexNet-level accuracy on ImageNet with 50x fewer parameter. • VGG-16: 16-layer image recognition model trained on ImageNet. • Vgg-19: 19-layer image recognition model trained on ImageNet. • Painter: A model tained to predict painters from artwork images. • DeepLoc: A model trained to analyze yeast cell images. • openface: Face recognition model trained on FaceScrub and CASIA-WebFace dataset. • http://vintage.winklerbros.net/facescrub.html 15
  • 16. ImageNet/ Inception V3 • ImageNet is an image database. The images in the database are organized into a hierarchy, with each node of the hierarchy depicted by hundreds and thousands of images. • Sample size of Training data: 1 million • Sample size of validation data: 50000 • Number of classes: 1000 16
  • 18. Data Preprocess (outlier detection) 18 Use iris.tab dataset
  • 19. Outlier Detection • Many applications require being able to decide whether a new observation belongs to the same distribution as existing observations (it is an inlier), or should be considered as different (it is an outlier). Often, this ability is used to clean real data sets. 19 Ref: https://scikit-learn.org/stable/modules/outlier_detection.html#outlier-detection
  • 20. Evaluation with outliers/ inliers 20 With outliers Without outliers
  • 22. Model and explanation 22 Use iris.tab dataset
  • 23. Model and explanation (Tree model) 23 What is the most significant variable?
  • 24. Tree also allows the numeric target 24 Use housing.tab dataset
  • 25. Tree model and prediction 25
  • 26. 26 Open model_build_on_preduction.ows • Main concepts: • Data exploration • Feature Statistics • Rank • Data Preprocess • Preprocess • Data Split • Data Sampler • Model • Tree/ Tree Viewer • Save Model/ Load Model • Test and Score • Confusion Matrix • Prediction
  • 27. Homework • Please attempt to apply the below-mentioned models to your own binary dataset, and endeavor to identify the model with the optimal performances as well as the most significant variables. (in next page) • Furthermore, it is advised to elucidate the underlying factors of model; you should include those subsections as below: • Introduction to your dataset • Data exploration • Model evaluation • Conclusion • Use PPT with texts and illustrations to present your observations. 27
  • 28. 28