ML - 2021
Biological Sciences faculty
Biophysics Department
Introduction to Applied Machine Learning
Presented By
Alireza Doustmohammadi
Graduate Student in Bioinformatics
January 2021
Contents
Introduction
Why do we need prediction?
Central Dogma of Prediction
Reference
Outline
Supervised Learning
Unsupervised Learning
Ensemble Learning
Why do we need prediction?
Welcome to the Era of Big Data...
[https://www.ncbi.nlm.nih.gov/genbank/statistics/]
PDB Data Distribution by Experimental Method and Molecular Type [https://www.rcsb.org/stats/summary]:

Molecular Type       X-ray   NMR    EM    Multiple methods  Neutron  Other  Total
Protein (only)       135896  34576  4544  165               67       34     152280
Protein/NA           7177    269    1603  3                 0        0      9052
Nucleic acid (only)  2158    1340   53    7                 2        1      3561
Other                149     31     3     0                 0        0      183
Total                153400  13453  6814  181               69       37     173754
Basic Concepts & Nomenclature
Central Dogma of Prediction
• Defining the Questions
• Data Collection
• Feature Extraction
• Preprocessing & Feature Selection
• Algorithm (Classifier)
• Evaluation
• Redesign the Algorithm (Parameter Tuning)
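The steps above can be sketched end to end in a few lines. This is only an illustration under stated assumptions, not the presenter's method: the synthetic two-class data, the standardization step, the k-nearest-neighbours classifier, the split sizes, and every function name are choices made here for brevity.

```python
import math
import random
import statistics

random.seed(0)

# Steps 1-3 (assumed): the question is "which class?", and each collected
# sample is already a two-feature vector with features on different scales.
def make_data(n_per_class):
    rows = []
    for label, center in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            features = [random.gauss(center, 1.0), random.gauss(center * 10, 10.0)]
            rows.append((features, label))
    random.shuffle(rows)
    return rows

# Step 4, preprocessing: standardize features using training statistics only.
def fit_scaler(X):
    return [(statistics.mean(c), statistics.pstdev(c) or 1.0) for c in zip(*X)]

def transform(X, scaler):
    return [[(v - m) / s for v, (m, s) in zip(row, scaler)] for row in X]

# Step 5, algorithm: k-nearest-neighbours majority vote (k is odd, so no ties).
def knn_predict(X_train, y_train, x, k):
    nearest = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))[:k]
    votes = sum(y_train[i] for i in nearest)
    return 1 if votes * 2 > k else 0

# Step 6, evaluation: fraction of correct predictions.
def accuracy(X_train, y_train, X_eval, y_eval, k):
    hits = sum(knn_predict(X_train, y_train, x, k) == y for x, y in zip(X_eval, y_eval))
    return hits / len(y_eval)

data = make_data(100)
X = [features for features, _ in data]
y = [label for _, label in data]
X_train, y_train = X[:120], y[:120]
X_val, y_val = X[120:160], y[120:160]
X_test, y_test = X[160:], y[160:]

scaler = fit_scaler(X_train)
X_train, X_val, X_test = (transform(part, scaler) for part in (X_train, X_val, X_test))

# Step 7, parameter tuning: choose k on the validation set,
# then report once on the untouched test set.
best_k = max((1, 3, 5, 7, 9), key=lambda k: accuracy(X_train, y_train, X_val, y_val, k))
test_accuracy = accuracy(X_train, y_train, X_test, y_test, best_k)
print(f"best k = {best_k}, test accuracy = {test_accuracy:.2f}")
```

Keeping a separate validation set for tuning means the final test accuracy is not inflated by the parameter search.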
Features
✓ Good representation of data
✓ Data Compression
✓ Requires expert knowledge
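As one hedged illustration of these three points, a protein sequence can be summarized by its amino-acid composition: a fixed-length vector of 20 fractions, whatever the sequence length. The choice of this feature and the example sequences are assumptions made here, not taken from the slides. Stacking one such vector per sequence gives a data matrix with one row per sample and one column per feature.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(sequence):
    """Return the fraction of each standard amino acid in the sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]

# Hypothetical sequences of different lengths
sequences = ["MKTAYIAKQR", "GGGGSGGGGSGGGGS", "MVLSPADKTNVKAAW"]

# Data matrix X: one row per sequence, one column per feature
X = [composition(s) for s in sequences]
for s, row in zip(sequences, X):
    print(len(s), "residues ->", len(row), "features")
```

Knowing that composition is a meaningful summary for a given question is where the domain expertise comes in; the code itself only performs the compression.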
Data Matrix
X, Y Relation
Reducible and irreducible error
Reducible Irreducible
Goal: Minimizing the reducible error
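A standard way to formalize this split assumes the additive model Y = f(X) + ε, where the noise ε has mean zero and is independent of X. The symbols f, f-hat, and ε are notation introduced here, since the slide text gives no formula. Treating the fitted model and X as fixed, the expected squared prediction error separates into two parts:

```latex
\begin{aligned}
E\big[(Y - \hat{Y})^2\big]
  &= E\big[(f(X) + \varepsilon - \hat{f}(X))^2\big] \\
  &= \underbrace{\big[f(X) - \hat{f}(X)\big]^2}_{\text{reducible}}
   + \underbrace{\operatorname{Var}(\varepsilon)}_{\text{irreducible}}
\end{aligned}
```

The cross term vanishes because E[ε] = 0. A better model can shrink the first term, but no model can push the error below Var(ε).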
Preprocessing & Feature Selection
Data Challenges:
➢ Missing values
➢ Low-frequency variant features
➢ Outliers
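A minimal sketch of how each of these three challenges might be detected or handled is given below. The toy data matrix, the median imputation, the 20% threshold, and the 1.5 x IQR rule are all assumptions chosen for illustration. In particular, "low-frequency variant features" is interpreted here as features that differ from their most common value in only a small fraction of samples.

```python
import statistics
from collections import Counter

# Hypothetical data matrix: rows are samples, None marks a missing value
X = [
    [1.0, 0, 5.1],
    [2.0, 0, None],
    [1.5, 0, 4.9],
    [1.8, 0, 5.3],
    [1.2, 1, 5.0],
    [1.6, 0, 42.0],  # suspicious value in the last column
    [1.4, 0, 5.2],
    [1.9, 0, 4.8],
]
columns = [list(col) for col in zip(*X)]

def impute_median(column):
    """Replace None with the median of the observed values."""
    fill = statistics.median(v for v in column if v is not None)
    return [fill if v is None else v for v in column]

def rarely_varies(column, min_fraction=0.2):
    """True if fewer than min_fraction of samples differ from the most common value."""
    most_common_count = Counter(column).most_common(1)[0][1]
    return 1 - most_common_count / len(column) < min_fraction

def iqr_outliers(column, factor=1.5):
    """Return indices of values outside the Tukey fences."""
    q1, _, q3 = statistics.quantiles(column, n=4)
    spread = q3 - q1
    low, high = q1 - factor * spread, q3 + factor * spread
    return [i for i, v in enumerate(column) if v < low or v > high]

columns = [impute_median(c) for c in columns]
rare_flags = [rarely_varies(c) for c in columns]
outlier_rows = iqr_outliers(columns[2])
print(rare_flags, outlier_rows)
```

The median is used for imputation because a mean would be pulled toward the outlier in the same column.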
Blessing and Curse of Dimensionality
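One facet of the curse can be shown numerically: as the number of features grows, the nearest and farthest points from a query end up at almost the same distance, which weakens distance-based methods. The sketch below assumes points drawn uniformly from the unit hypercube; the function name and the sample size are illustrative choices, not from the slides.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest distance from a random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    nearest, farthest = min(distances), max(distances)
    return (farthest - nearest) / nearest

for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 2))
```

A contrast near zero means "nearest neighbour" carries little information, which is one motivation for feature selection before modelling.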
Algorithms
Most Common Algorithms
• KNN
• Regression
• MLP
• ANN
• SVM
• Decision Tree
• Random Forest
• Bayesian Net
Central Dogma of Prediction
Algorithm (Model Selection): Overfitting
“All models are wrong, but some are useful.”
George Box, British statistician (1919-2013)
Increasing the size of the data set may reduce overfitting.
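The effect of data set size can be illustrated with a model that memorizes its training data. The sketch below assumes a 1-nearest-neighbour regressor, a sine-shaped target, and Gaussian noise, none of which come from the slides. Training error is zero by construction, so the test error is the whole train-test gap.

```python
import math
import random

def true_f(x):
    return math.sin(2 * math.pi * x)

def sample(n, rng, noise=0.1):
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0, noise) for x in xs]
    return xs, ys

def predict_1nn(train_x, train_y, x):
    """Return the response of the single closest training point."""
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]

def mse(train_x, train_y, xs, ys):
    errors = ((predict_1nn(train_x, train_y, x) - y) ** 2 for x, y in zip(xs, ys))
    return sum(errors) / len(xs)

rng = random.Random(0)
test_x, test_y = sample(500, rng)

gaps = {}
for n in (10, 100, 1000):
    train_x, train_y = sample(n, rng)
    train_error = mse(train_x, train_y, train_x, train_y)  # zero: the model memorizes
    test_error = mse(train_x, train_y, test_x, test_y)
    gaps[n] = test_error - train_error
    print(n, round(train_error, 3), round(test_error, 3))
```

The gap shrinks as n grows but does not reach zero, because the noise in the data cannot be learned away.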
Bias-variance trade-off
Increasing flexibility:
▪ Bias tends to decrease initially faster than variance increases.
▪ At some point, increasing flexibility has little impact on the bias but starts to significantly increase the variance.
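This behaviour can be reproduced with a small experiment. The sketch below assumes a k-nearest-neighbours regressor on synthetic sine data, where a smaller k means a more flexible model; the data, noise level, and values of k are illustrative choices. Training error keeps falling as flexibility increases, while test error falls and then rises again.

```python
import math
import random

rng = random.Random(1)

def sample(n, noise=0.3):
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0, noise) for x in xs]
    return xs, ys

def knn_predict(train_x, train_y, x, k):
    """Average the responses of the k closest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

def mse(train_x, train_y, xs, ys, k):
    errors = ((knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys))
    return sum(errors) / len(xs)

train_x, train_y = sample(100)
test_x, test_y = sample(500)

errors = {}
for k in (100, 30, 9, 3, 1):  # from least to most flexible
    train_error = mse(train_x, train_y, train_x, train_y, k)
    test_error = mse(train_x, train_y, test_x, test_y, k)
    errors[k] = (train_error, test_error)
    print(f"k={k:3d}  train={train_error:.3f}  test={test_error:.3f}")
```

With k = 100 every prediction is the global mean (high bias), and with k = 1 the model follows the noise in individual points (high variance).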
The Bias-Variance Trade-off
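For squared-error loss, the expected test error at a point x_0 has a standard decomposition into three terms. The notation is an assumption made here, since the slide text gives no formula: f-hat is the fitted model, ε is the noise in the additive model Y = f(X) + ε, and the expectation is taken over training sets and noise.

```latex
E\big[(y_0 - \hat{f}(x_0))^2\big]
  = \operatorname{Var}\big(\hat{f}(x_0)\big)
  + \big[\operatorname{Bias}\big(\hat{f}(x_0)\big)\big]^2
  + \operatorname{Var}(\varepsilon)
```

More flexible models typically lower the bias term and raise the variance term; the last term is a floor that no choice of model removes.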
In-Sample Error vs. Out-of-Sample Error
In-sample error:
• Training data
• Bias
Out-of-sample error:
• Test data
• Variance
The peaking paradox