DIFFERENT ALGORITHMS
USED IN
CLASSIFICATION
WHAT IS CLASSIFICATION
• Classification identifies the category of a new observation on the basis of training data.
• The model learns from the given dataset and then assigns new data to one of a number of classes.
• Classification can be of two types:
• Binary Classification
• Multi-Class Classification
HOW CLASSIFICATION WORKS
• Classification is performed in the following manner:
• First, the labelled training dataset is given to the classifier (classification algorithm).
• The classifier analyses the dataset and recognizes the patterns present in it.
• After that, a suitable classification model is chosen according to the task to be performed.
• Finally, the labelled testing dataset is provided to the model to check its performance.
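The workflow above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names (`train_test_split`, `fit_threshold`, `accuracy`), the toy one-feature dataset, and the midpoint-threshold "model" are all hypothetical and are not taken from the slides.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle labelled rows and split them into (training, testing) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]

def fit_threshold(train):
    """Toy model (assumption): midpoint between the two class means."""
    class0 = [x for x, y in train if y == 0]
    class1 = [x for x, y in train if y == 1]
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2

def predict(threshold, x):
    """Assign class 1 above the learned threshold, class 0 otherwise."""
    return 1 if x > threshold else 0

def accuracy(threshold, test):
    """Fraction of labelled test rows that the model classifies correctly."""
    correct = sum(1 for x, y in test if predict(threshold, x) == y)
    return correct / len(test)

# Hypothetical labelled data: (feature value, class label).
data = [(x, 0) for x in (1.0, 1.5, 2.0, 2.5, 3.0, 3.5)]
data += [(x, 1) for x in (6.0, 6.5, 7.0, 7.5, 8.0, 8.5)]

train, test = train_test_split(data)   # step 1: labelled training data
threshold = fit_threshold(train)       # steps 2-3: learn the pattern
score = accuracy(threshold, test)      # step 4: check on labelled test data
print(score)
```

The test rows are held back during fitting, so the accuracy measures how well the learned pattern carries over to data the model has not seen.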
CLASSIFICATION ALGORITHMS
• Classification algorithms are commonly grouped into two types:
• Linear: Logistic Regression, Support Vector Machine (SVM)
• Non-Linear: K-Nearest Neighbours (KNN), Kernel SVM, Decision Tree
• This division into linear and non-linear is based on the kind of dataset each one handles (linearly separable or non-linearly separable).
LOGISTIC REGRESSION
• Works on linearly separable data with a categorical dependent variable.
• Works best with binary classification.
• Uses the sigmoid function, y = 1 / (1 + e^(-x)), to classify data.
• Sigmoid function: converts the independent variable into a probability between 0 and 1 with respect to the dependent variable.
• The threshold value is taken as 0.5.
• If the probability is > 0.5, the observation is labelled 1, i.e. it belongs to class A (upper class).
• If the probability is < 0.5, the observation is labelled 0, i.e. it belongs to class B (lower class).
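A minimal sketch of the sigmoid and the 0.5 threshold, assuming that a probability above 0.5 maps to class 1 and everything else to class 0. The slide does not say what happens at exactly 0.5, so sending that case to class 0 is an assumption, as are the function names.

```python
import math

def sigmoid(x):
    """y = 1 / (1 + e^(-x)), written to avoid overflow for very negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def classify(x, threshold=0.5):
    """Return 1 (class A) if the probability exceeds the threshold, else 0 (class B)."""
    # Assumption: a probability exactly equal to the threshold goes to class 0.
    return 1 if sigmoid(x) > threshold else 0

for x in (-2.0, 0.0, 2.0):
    print(x, round(sigmoid(x), 3), classify(x))
```

In a fitted logistic regression model, `x` would be the weighted sum of the input features rather than a raw value.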
SUPPORT VECTOR MACHINE (SVM)
• Works on linearly separable data.
• Classifies the data into different categories with the help of a decision boundary.
• The decision boundary (hyperplane) is derived from the dataset so that it divides the data points into two categories; it is then used to decide which class a new data point belongs to.
• The margin of the hyperplane is calculated; it plays an important role in choosing the hyperplane.
• The margin is the distance between the two parallel lines that pass through the support vectors of each class.
• The points from each class that lie nearest to the hyperplane, and through which the lines parallel to the hyperplane are drawn, are known as support vectors.
• The hyperplane with the maximum margin is chosen as the final hyperplane (Maximal Margin Hyperplane, MMH).
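The idea of preferring the hyperplane with the larger margin can be illustrated with a small comparison. This is a sketch only: the two candidate hyperplanes and the four data points are hand-picked assumptions, and real SVM solvers find the maximal margin hyperplane by optimisation rather than by comparing candidates. Here the margin is measured as the distance from the hyperplane to its nearest point, which is half of the width between the two parallel lines when the hyperplane sits midway between them.

```python
import math

def signed_distance(w, b, point):
    """Signed distance from a point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, point)) + b) / norm

def margin(w, b, points, labels):
    """Distance from the hyperplane to the closest point.

    Labels are +1 or -1; the result is negative if any point is misclassified.
    """
    return min(y * signed_distance(w, b, p) for p, y in zip(points, labels))

# Hypothetical, linearly separable data.
points = [(1, 1), (2, 2), (4, 4), (5, 5)]
labels = [-1, -1, +1, +1]

# Two hand-picked candidate hyperplanes as (w, b) pairs.
candidates = {
    "vertical line x = 3": ((1, 0), -3),
    "diagonal line x + y = 6": ((1, 1), -6),
}

margins = {name: margin(w, b, points, labels) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
print(best, margins[best])
```

The points closest to the winning hyperplane, (2, 2) and (4, 4), play the role of the support vectors.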
K-NEAREST NEIGHBOURS (KNN)
• Works on non-linearly separable data.
• K in KNN is the number of neighbours to consider.
• The K data points with the minimum distance from the data point to be classified are selected.
• Euclidean distance is used to calculate the distance between two data points.
• Euclidean distance: d = sqrt((Xo1 - Xa1)^2 + (Xo2 - Xa2)^2)
• The class is assigned according to the proportion of each class among those K neighbours.
• In the case of a 50% tie, one more neighbour is taken into consideration.
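The KNN procedure, including the "one more neighbour" rule for ties, can be sketched as follows. The function names and the toy data are hypothetical; the distance is the Euclidean distance from the slide, generalised to any number of coordinates.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labelled points.

    `train` is a list of (point, label) pairs. On a tie between the two
    leading classes, one more neighbour is added and the vote is repeated.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))
    while True:
        votes = Counter(label for _, label in neighbours[:k])
        ranked = votes.most_common(2)
        no_tie = len(ranked) == 1 or ranked[0][1] > ranked[1][1]
        if no_tie or k >= len(neighbours):
            return ranked[0][0]
        k += 1  # tie: take one more neighbour into consideration

# Hypothetical labelled data.
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((6, 6), "B"), ((6, 7), "B"), ((7, 6), "B")]

print(knn_predict(train, (2, 2), k=3))
print(knn_predict(train, (6, 5), k=3))
```

If the vote is still tied after every training point has been included, this sketch returns the first of the tied classes, which is a choice the slides do not specify.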
KERNEL SUPPORT VECTOR MACHINE
• Works on non-linearly separable data.
• Kernel SVM works with the help of a kernel function.
• The kernel function is used to map the low-dimensional feature space (LDFS) to a high-dimensional feature space (HDFS).
• Mapping the data from LDFS to HDFS makes it possible to draw a separating hyperplane and classify the data.
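A small illustration of why moving to a higher-dimensional feature space helps. The one-dimensional dataset and the explicit mapping x -> (x, x^2) are assumptions chosen for the example; practical kernel SVMs usually do not build the mapped features explicitly but evaluate a kernel function on pairs of points instead.

```python
def feature_map(x):
    """Hypothetical mapping from 1-D to 2-D feature space: x -> (x, x^2)."""
    return (x, x * x)

def separable_by_threshold(values, labels):
    """True if a single threshold on one feature splits the two classes."""
    ordered = [label for _, label in sorted(zip(values, labels))]
    changes = sum(1 for a, b in zip(ordered, ordered[1:]) if a != b)
    return changes <= 1

# Hypothetical data: the "inner" class sits between two groups of "outer" points.
xs = [-3, -2, -1, 0, 1, 2, 3]
labels = ["outer", "outer", "inner", "inner", "inner", "outer", "outer"]

# In the original 1-D space no single threshold separates the classes.
print(separable_by_threshold(xs, labels))

# After mapping, a threshold on the second coordinate (x^2) separates them,
# which corresponds to a horizontal hyperplane in the 2-D space.
mapped = [feature_map(x) for x in xs]
second_coordinate = [point[1] for point in mapped]
print(separable_by_threshold(second_coordinate, labels))
```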
DECISION TREE
• Works on non-linearly separable datasets.
• Tree-structured.
• Contains two types of nodes:
• Decision Nodes
• Leaf Nodes
• The root node receives the whole dataset, which is then split step by step according to conditions on the feature values.
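How a decision node might choose its split can be sketched with Gini impurity. The slides do not name a splitting criterion, so Gini is an assumption here (it is one commonly used option), and the function names and toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows):
    """Find the threshold on a single feature with the lowest weighted Gini.

    `rows` is a list of (feature_value, label) pairs.
    Returns (threshold, weighted_gini), or None if no split is possible.
    """
    best = None
    values = sorted({x for x, _ in rows})
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical data at the root node: (feature value, class label).
rows = [(1, "no"), (2, "no"), (3, "no"), (7, "yes"), (8, "yes"), (9, "yes")]
print(best_split(rows))
```

Each side of the chosen split becomes either another decision node, if it is still mixed, or a leaf node, if it contains a single class.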
DIFFERENT ALGORITHMS
USED IN CLUSTERING
CLUSTERING
• Clustering is the task of dividing a population or set of data points into groups such that data points in the same group are more similar to each other than to data points in other groups. It is essentially a grouping of objects on the basis of the similarity and dissimilarity between them.
WHAT ARE THE USES OF CLUSTERING?
Clustering has a myriad of uses in a variety of industries. Some common applications for clustering
include the following:
• Market Segmentation
• Statistical data analysis
• Social network analysis
• Image segmentation
• Anomaly detection, etc.
TYPES OF CLUSTERING METHODS
1. Partitioning Clustering
2. Density-Based Clustering
3. Distribution Model-Based Clustering
4. Hierarchical Clustering
PARTITIONING CLUSTERING
• It is a type of clustering that divides the data into non-hierarchical groups. It is
also known as the centroid-based method.
• The most common example of partitioning clustering is the K-Means Clustering
algorithm.
K-MEANS CLUSTERING ALGORITHM
• K-Means Clustering is an unsupervised learning algorithm that groups an unlabeled dataset into different clusters.
• It is an iterative algorithm that divides the unlabeled dataset into k clusters in such a way that each data point belongs to only one group of points with similar properties.
• It provides a convenient way to discover the groups in an unlabeled dataset on its own, without the need for labelled training data.
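The iterative procedure can be sketched as the usual assign-then-update loop. This is a minimal illustration: the function names, the toy points, and the fixed starting centroids are assumptions (implementations often pick the starting centroids at random).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(points, initial_centroids, max_iters=100):
    """Assign each point to its nearest centroid, then move each centroid
    to the mean of its points; repeat until the centroids stop changing."""
    centroids = [tuple(float(v) for v in c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical unlabeled data with two visible groups, k = 2.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, initial_centroids=[(1, 1), (9, 8)])
print(centroids)
print(clusters)
```

Each point ends up in exactly one cluster, matching the requirement that a data point belongs to only one group.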
DENSITY-BASED CLUSTERING
• Identifies distinctive clusters in the data.
• Based on the idea that a cluster in a data space is a contiguous region of high point density, separated from other clusters by sparse regions.
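The slide does not name a specific algorithm. As one illustration of the density idea, here is a compact sketch in the style of DBSCAN, a widely used density-based method. The parameter names `eps` (neighbourhood radius) and `min_pts` (minimum neighbours for a dense point), the label -1 for noise, and the toy data are assumptions of this sketch.

```python
def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    eps_sq = eps * eps
    return [j for j, q in enumerate(points)
            if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps_sq]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster index, or -1 for noise."""
    noise = -1
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = noise  # sparse region (may later become a border point)
            continue
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # dense point: keep growing the cluster
        cluster += 1
    return labels

# Hypothetical data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Unlike K-Means, this approach does not need the number of clusters in advance, and points in sparse regions are left unassigned as noise.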
HIERARCHICAL CLUSTERING
