Linear Regression
Linear approach for modelling the relationship between a
scalar dependent variable y and one or more explanatory
variables (or independent variables) x
The best-fit line is found by least squares regression, which
minimises the sum of squared residuals.
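
As a rough illustration of the least squares fit for a single explanatory variable, the sketch below computes the slope and intercept in closed form. The function name fit_line and the sample data are assumptions made for this example only.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

With several explanatory variables the same idea applies, but the coefficients are usually obtained by solving a linear system rather than with these two formulas.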
Decision Tree
Used in classification problems with a predefined target
variable. A decision tree is a tree in which each branch
node represents a choice between a number of
alternatives and each leaf node represents a decision.
Tree models where the target variable can take a
discrete set of values are called classification trees. In
these tree structures, leaves represent class labels and
branches represent conjunctions of features that lead
to those class labels.
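
The structure described above can be sketched with nested dictionaries: branch nodes test a feature and leaf nodes hold a class label. The tree below is hand-built and hypothetical (the features "outlook" and "windy" are invented for illustration); it shows only how a fitted tree classifies a sample, not how the tree is learned.

```python
# Hypothetical hand-built tree: branch nodes name a feature to test,
# leaf nodes carry the class label.
tree = {
    "feature": "outlook",
    "branches": {
        "sunny": {
            "feature": "windy",
            "branches": {
                True: {"label": "stay in"},
                False: {"label": "play"},
            },
        },
        "rainy": {"label": "stay in"},
    },
}


def classify(node, sample):
    """Follow the branch matching each tested feature until a leaf is reached."""
    while "label" not in node:
        node = node["branches"][sample[node["feature"]]]
    return node["label"]


result = classify(tree, {"outlook": "sunny", "windy": False})
```

The path taken to the leaf ("outlook is sunny" and "windy is false") is the conjunction of features that leads to the class label.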
K Nearest Neighbours
Instance-based learning, or lazy learning: the
function is only approximated locally, and
all computation is deferred until
classification.
Used for classification and regression.
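
A minimal sketch of the classification case, assuming Euclidean distance and a simple majority vote among the k nearest training points. The function name knn_predict and the toy data are assumptions for illustration. Note that there is no training step: all the work happens when a query arrives.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train is a list of (point, label) pairs; return the majority label
    among the k training points closest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Illustrative data: two well-separated groups.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
prediction = knn_predict(train, (1.1, 1.0))
```

For regression, the vote would be replaced by an average of the neighbours' target values.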
Logistic Regression
A classification model (the class variable is
categorical).
It models the log-odds of the outcome as a
linear function of the predictors; the logistic
(sigmoid) function maps the log-odds to a
probability.
Used for binary classification problems
such as pass/fail or fraud/genuine.
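
The prediction step can be sketched as follows: a linear combination of the features gives the log-odds, and the sigmoid turns that into a probability. The weights and bias used here are made-up values for a hypothetical pass/fail model with "hours studied" as its single feature; fitting them from data is not shown.

```python
import math


def sigmoid(z):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability of the positive class for one sample."""
    log_odds = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(log_odds)


def predict(weights, bias, features, threshold=0.5):
    """Return 1 (e.g. pass) or 0 (e.g. fail) by thresholding the probability."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0


# Hypothetical parameters, not fitted to any real data.
weights, bias = [1.5], -6.0
p_boundary = predict_proba(weights, bias, [4.0])  # log-odds of exactly 0
```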
Naïve Bayes
Naive Bayes classifiers are a family of simple
probabilistic classifiers based on applying Bayes'
theorem with strong (naive) independence
assumptions between the features.
It is used for binary and multiclass classification
problems.
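
One member of this family can be sketched for word-based text classification. The version below assumes a multinomial model with add-one (Laplace) smoothing, which is a choice made for this example rather than something the slides specify; the function names and the four training messages are also invented.

```python
import math
from collections import Counter, defaultdict


def train_nb(samples):
    """samples is a list of (list_of_words, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in samples:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def predict_nb(model, words):
    """Pick the label maximising log P(label) + sum of log P(word | label)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in words:
            # The naive assumption: each word contributes independently.
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


samples = [
    (["win", "money", "now"], "spam"),
    (["free", "money"], "spam"),
    (["meeting", "at", "noon"], "ham"),
    (["lunch", "at", "noon"], "ham"),
]
model = train_nb(samples)
label = predict_nb(model, ["free", "money"])
```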
Principal Component Analysis (PCA)
Statistical procedure that uses an orthogonal
transformation to convert a set of observations
of possibly correlated variables into a set of
values of linearly uncorrelated variables called
principal components.
PCA is mostly used as a tool in exploratory data
analysis and for making predictive models
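
As a sketch of the idea, the first principal component is the direction of greatest variance, i.e. the leading eigenvector of the covariance matrix. The code below estimates it with power iteration, which is one simple way to do this and an assumption of this example; library implementations typically use a singular value decomposition instead. The function name and data are illustrative.

```python
import math


def first_principal_component(rows, iterations=100):
    """Estimate the unit vector along which the data varies most."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centred data.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration. This sketch assumes the all-ones start vector is not
    # orthogonal to the leading eigenvector; if it is, the norm becomes zero.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Two perfectly correlated variables: all variance lies along the diagonal.
rows = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
component = first_principal_component(rows)
```

Projecting the centred data onto this vector gives the values of the first principal component.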
K-Means
A method of vector quantization, originally from
signal processing, that is popular for cluster
analysis in data mining. k-means clustering aims
to partition n observations into k clusters in
which each observation belongs to the cluster
with the nearest mean, which serves as a prototype
of the cluster. This results in a partitioning of the
data space into Voronoi cells.
Hierarchical Clustering
Hierarchical clustering is a method of cluster
analysis which seeks to build a hierarchy of
clusters, either bottom-up by merging clusters
(agglomerative) or top-down by splitting them
(divisive). The merges and splits are determined
in a greedy manner. The results of hierarchical
clustering are usually presented in a
dendrogram.
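
The greedy merging can be sketched for the bottom-up (agglomerative) case. This example assumes one-dimensional points and single linkage (cluster distance is the distance between the closest pair of members); both are choices made here for brevity. The recorded merges, with their distances, are the information a dendrogram displays.

```python
def agglomerative(points):
    """Repeatedly merge the two closest clusters; return the merge history
    as (cluster_a, cluster_b, distance) tuples."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


merges = agglomerative([1.0, 2.0, 6.0, 7.0, 20.0])
```

Each merge is chosen greedily: once two clusters are joined, the decision is never revisited.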
Apriori Algorithm
Used for frequent item set mining and association rule
learning over transactional databases.
It identifies the frequent individual items in the database
and extends them to larger and larger item sets as long as
those item sets appear sufficiently often in the database.
The frequent item sets determined by Apriori can be used
to determine association rules which highlight general
trends in the database.
Used in applications such as market basket analysis
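
A compact sketch of the level-wise search described above, assuming support is measured as an absolute transaction count. The function name apriori and the small shopping-basket data are invented for illustration, and the sketch stops at frequent item sets without deriving association rules.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every frequent item set."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = {item for t in transactions for item in t}
    current = {
        frozenset([i]) for i in items if support(frozenset([i])) >= min_support
    }
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Extend frequent (k-1)-item sets to candidate k-item sets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Keep a candidate only if all its (k-1)-subsets are frequent
        # and it appears often enough itself.
        current = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(baskets, min_support=2)
```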
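FP-Tree
Association rule learning is a rule-based machine
learning method for discovering interesting relations
between variables in large databases. The FP-tree
(frequent-pattern tree) is a compact prefix-tree
representation of the transactions, used by the
FP-Growth algorithm to mine frequent item sets
without generating candidates as Apriori does.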
Random Forest
An ensemble learning method for classification
and regression. A random forest grows many
decision trees at training time. Its output is the
mode of the classes predicted by the individual
trees (classification) or the mean of their
predictions (regression). In classification, each
tree gives a classification, and we say the tree
"votes" for that class. The random forest chooses
the classification having the most votes.
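
The aggregation step alone can be sketched as below. The "trees" here are hypothetical stand-in functions with hard-coded rules, not trained models; growing the trees on bootstrap samples with random feature subsets is not shown.

```python
from collections import Counter
from statistics import mean


def forest_classify(trees, sample):
    """Classification: return the class with the most votes (the mode)."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


def forest_regress(trees, sample):
    """Regression: return the mean of the individual predictions."""
    return mean([tree(sample) for tree in trees])


# Hypothetical stand-ins for trained classification trees.
class_trees = [
    lambda s: "spam" if s["links"] > 2 else "ham",
    lambda s: "spam" if s["caps"] > 0.5 else "ham",
    lambda s: "spam" if s["length"] < 100 else "ham",
]
# Hypothetical stand-ins for trained regression trees.
value_trees = [lambda s: 10.0, lambda s: 12.0, lambda s: 14.0]

sample = {"links": 3, "caps": 0.1, "length": 50}
winner = forest_classify(class_trees, sample)
average = forest_regress(value_trees, sample)
```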
Support Vector Machine (SVM)
SVMs are based on the idea of finding a
hyperplane that best divides a dataset into two
classes.
SVMs are commonly used in classification
problems, for example text classification tasks
such as category assignment, spam detection
and sentiment analysis.
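
Once a separating hyperplane w·x + b = 0 is known, classifying a point reduces to checking which side of it the point falls on. The sketch below assumes a linear SVM with made-up values for w and b; finding the hyperplane with the largest margin (the training step) is not shown.

```python
import math


def decision_value(w, b, x):
    """Signed score: positive on one side of the hyperplane, negative on the other."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def svm_classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def distance_to_hyperplane(w, b, x):
    """Geometric distance from x to the hyperplane w.x + b = 0."""
    return abs(decision_value(w, b, x)) / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane x1 + x2 - 5 = 0, not learned from data.
w, b = (1.0, 1.0), -5.0
side = svm_classify(w, b, (4.0, 4.0))
gap = distance_to_hyperplane(w, b, (4.0, 4.0))
```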

Machine Learning (simplified)
