©GoFlek, Inc
Introduction to Supervised Machine Learning
Abbas Taher, Founder & Product Manager
GoFlek*: Intelligent Data Mining
* Flek is the Norwegian word for spot
Outline
I. Data Mining vs. Machine Learning
II. ML Techniques and Use Cases
III. Supervised ML Pipeline
IV. Supervised ML Model Training
Math is sometimes called the
science of patterns
Ronald Graham
Some ML techniques tell their reasoning
when predicting
Others read it from a crystal ball
What is …
Data Mining: Solving business problems through data analysis. The process of discovering patterns in data using ML methods and tools, among others.
Machine Learning: Adaptive methods for uncovering the “structural patterns” encapsulated within the data.
Nuggets & Insights: Hidden patterns, models, predictions, statistics, rules, clusters, classification, maps, recurring data, outliers, groupings.
Data Mining Process
Data Mining: the process of discovering patterns. ML is used to find and describe “structural patterns” in the data.
Flow: Data → Compute → Structural patterns computed from the data → Infer → Patterns hidden in the data that might provide insight
Structures used to hold the patterns: decision trees, vectors & matrices, graphs, neural networks, rules
What is Machine Learning
ML is the acquisition of “structural patterns” from sample data.
In a nutshell, ML is a set of adaptive methods to discover those patterns. The methods used have two core components:
• Algorithms for making computations
• Structures for storing the resulting models
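The two components can be made concrete with a minimal sketch. It assumes a one-feature, two-class problem and uses a decision stump (a one-rule decision tree) purely as an illustration; the names `Stump` and `fit_stump` and the sample data are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """Structure: stores the learned pattern as a single readable rule."""
    threshold: float
    label_below: str
    label_above: str

    def predict(self, x: float) -> str:
        return self.label_below if x <= self.threshold else self.label_above


def fit_stump(xs, ys):
    """Algorithm: try each midpoint between sorted feature values and keep
    the threshold and label assignment with the fewest training errors."""
    labels = sorted(set(ys))
    assert len(labels) == 2, "this sketch handles two classes only"
    values = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        for below, above in (labels, labels[::-1]):
            stump = Stump(t, below, above)
            errors = sum(stump.predict(x) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, stump)
    return best[1]


# Hypothetical sample data: one numeric feature per message, plus a label.
xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
ys = ["ham", "ham", "ham", "spam", "spam", "spam"]

model = fit_stump(xs, ys)
print(model)  # Stump(threshold=5.0, label_below='ham', label_above='spam')
```

The printed model is the "structure": it can be read, queried and reasoned about independently of the algorithm that produced it.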
What makes a good ML method?
• The structure (model) should be rich and flexible, and should allow itself to be examined, reasoned about and queried for information
• The model may be used directly or indirectly to make predictions that are traceable and interpretable
• The computational algorithms should be based on a sound mathematical formulation
• The algorithms should be consistent, i.e., generate the same results from the same sample data
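The consistency requirement is easy to check in code. The sketch below assumes a method with a random starting point (a one-dimensional k-means, chosen only as an example) and shows one common way to make it repeatable: pass an explicit seed. The function name `kmeans_1d`, the seed value and the data are assumptions for illustration.

```python
import random


def kmeans_1d(points, k, seed, iterations=20):
    """Tiny 1-D k-means. The only source of randomness is the initial
    choice of centroids, which is controlled by `seed`."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)


points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
run_a = kmeans_1d(points, k=2, seed=42)
run_b = kmeans_1d(points, k=2, seed=42)
print(run_a == run_b)  # True: same data and same seed give the same model
```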
Main ML Techniques
Supervised Learning (model training using labels)
• Classification: k-nearest neighbors, decision tree, random forest, rule induction, naive Bayes, Bayesian networks, neural networks, support vector machines, ensembles, boosted trees
• Regression: linear, logistic, local, nonlinear, GLM
Unsupervised Learning (model training without labels)
• Clustering: k-means, hierarchical, density, distribution based, Latent Dirichlet Allocation (LDA)
• Dimension Reduction: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Singular Value Decomposition (SVD)
• Outlier Detection: fraud, hacking, abnormal activity
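To show what "training using labels" means in practice, here is a minimal sketch of k-nearest neighbors, the first classifier named on this slide. It is written in plain Python for illustration; the function name `knn_predict`, the value `k=3` and the data points are assumptions, not material from the deck.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Supervised prediction: a majority vote over the labels of the
    k training rows closest to the query point."""
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled sample: each row of features has a known label.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.5)]
train_y = ["low", "low", "low", "high", "high", "high"]

print(knn_predict(train_X, train_y, (2.0, 2.0)))  # low
print(knn_predict(train_X, train_y, (8.0, 9.0)))  # high
```

An unsupervised method would receive only `train_X`; without `train_y` it can group or screen the rows but has no label to vote on.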
ML Use Cases
Supervised Learning (used for predictions)
• Classification: recommendation, spam detection, market basket analysis, online date matching, sales opportunity scoring, equipment failure prediction, churn, marketing lead generation, document classification, sentiment analysis
• Regression: drug response, stock returns, scenario analysis, risk estimates, customer retention, demand forecasting
Unsupervised Learning (used for grouping or screening)
• Clustering: customer segmentation, topic modeling, document grouping, sequence analysis, social networks
• Dimension Reduction: feature extraction & reduction, genetic visualization, face recognition, model reduction, correlation analysis, word co-occurrence, NLP
• Outlier Detection: fraud, hacking, abnormal activity
ML Classifiers & Techniques
k-nearest neighbors
decision tree
random forest
rule induction
naive Bayes
Bayesian networks
support vector machines
ensembles
boosted trees
neural networks
Linear regression
logistic regression
nonlinear regression
bagging
deep learning
association rules
probabilistic inference
influence graphs
Supervised ML Flow (diagram built up over three slides)
1. Data Preparation: collect sample data and labels, then apply feature selection & extraction to build the feature matrix
2. Model Training & Tuning: the ML algorithm produces multiple models
3. Models Testing & Validation: the best classifier model is chosen
4. Prediction: the ML predictive model takes new input data and returns predicted labels
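The four stages of this flow (data preparation, model training, testing & validation, prediction) can be sketched end to end in a few lines. Everything specific here is an assumption made for illustration: the toy messages, the two features in `extract`, and the two candidate models. It is a sketch of the flow, not the author's pipeline.

```python
from collections import Counter

# --- Data preparation: collected samples with labels ---
raw = [("win cash now!!!", "spam"), ("meeting at noon", "ham"),
       ("free prize!!", "spam"), ("lunch tomorrow?", "ham"),
       ("claim your reward!!!", "spam"), ("see you at home", "ham"),
       ("cheap loans!!", "spam"), ("project update attached", "ham")]


def extract(text):
    """Feature extraction: one row of the feature matrix."""
    return (text.count("!"), len(text.split()))


X = [extract(text) for text, _ in raw]
y = [label for _, label in raw]
X_train, y_train, X_val, y_val = X[:6], y[:6], X[6:], y[6:]


# --- Model training: several candidate models ---
def train_majority(X, y):
    """Baseline: always predict the most frequent training label."""
    top = Counter(y).most_common(1)[0][0]
    return lambda row: top


def train_exclaim_rule(X, y):
    """Rule: learn the smallest '!' count seen among spam rows."""
    cutoff = min(row[0] for row, label in zip(X, y) if label == "spam")
    return lambda row: "spam" if row[0] >= cutoff else "ham"


candidates = {"majority": train_majority(X_train, y_train),
              "exclaim_rule": train_exclaim_rule(X_train, y_train)}


# --- Testing & validation: keep the best classifier model ---
def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


scores = {name: accuracy(m, X_val, y_val) for name, m in candidates.items()}
best_name = max(scores, key=scores.get)
best_model = candidates[best_name]

# --- Prediction: new input data in, predicted label out ---
print(best_name, best_model(extract("urgent offer!!!")))  # exclaim_rule spam
```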
Supervised ML Flow
1. Prepare the dataset
2. Split it into training data and test data
3. Select a model template and its parameters
4. Train the model on the training data
5. Score the model
6. Tune & validate
7. Test the final model on the test data
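The boxes on this slide (prepare, split, select a model template and parameters, train, score, tune & validate, test) map onto a short training loop. The sketch below assumes k-nearest neighbors as the model template and `k` as the parameter being tuned; the synthetic two-blob dataset, the 60/20/20 split and the candidate values of `k` are all illustrative assumptions.

```python
import math
import random
from collections import Counter


def make_dataset(rng, n_per_class=30):
    """Hypothetical dataset: two Gaussian blobs labelled 0 and 1."""
    rows = []
    for label, center in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            rows.append((point, label))
    return rows


def knn_predict(train, query, k):
    """Model template: majority vote among the k nearest training rows."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def score(train, rows, k):
    """Score model: fraction of rows whose label is predicted correctly."""
    return sum(knn_predict(train, x, k) == y for x, y in rows) / len(rows)


rng = random.Random(0)           # fixed seed so the run is repeatable
data = make_dataset(rng)
rng.shuffle(data)                # prepare
train, val, test = data[:36], data[36:48], data[48:]   # split 60/20/20

# Tune & validate: try each parameter value on the validation rows.
val_scores = {k: score(train, val, k) for k in (1, 3, 5, 7)}
best_k = max(val_scores, key=val_scores.get)

# Final model: refit on train + validation, then test once on held-out data.
final_train = train + val
test_accuracy = score(final_train, test, best_k)
print(best_k, test_accuracy)
```

Keeping the test rows out of every tuning step is the point of the split: the final score is then an estimate on data the model has never seen.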
Choice of Classifier
The choice of classifier/model depends on:
A. The structural patterns needed to understand and gain insights into the available data
B. The data type of the target label:
• Categorical or yes/no: discrete classifiers
• Continuous: linear regression
• Probability estimates: naive Bayes
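Point B can be written down as a small helper that inspects the target labels and suggests a model family. This is a rough heuristic for illustration only: the function name `suggest_model_family` and the `max_classes` cutoff between "categorical" and "continuous" are assumptions, and a real choice would also weigh point A.

```python
from numbers import Real


def suggest_model_family(labels, want_probabilities=False, max_classes=10):
    """Map the target label's data type to the families named on the slide.

    max_classes is an arbitrary assumption: numeric targets with more
    distinct values than this are treated as continuous.
    """
    if want_probabilities:
        return "naive Bayes"
    distinct = set(labels)
    numeric = all(isinstance(v, Real) and not isinstance(v, bool)
                  for v in distinct)
    if numeric and len(distinct) > max_classes:
        return "linear regression"
    return "discrete classifier"


print(suggest_model_family(["yes", "no", "yes"]))             # discrete classifier
print(suggest_model_family([0.1 * i for i in range(50)]))     # linear regression
print(suggest_model_family(["a", "b"], want_probabilities=True))  # naive Bayes
```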
Thank You
Q & A
abbas@goflek.com
Slide share/Abbas Taher


Editor's Notes

  1. Flek is the Yiddish word for spot.