SlideShare a Scribd company logo
1 of 12
(CentreforKnowledgeTransfer)
institute
Cluster
ing
Dr. C.V. Suresh Babu
(CentreforKnowledgeTransfer)
institute
What is Clustering
• Clustering is the task of dividing the population or data points into a
number of groups such that data points in the same groups are more
similar to other data points in the same group and dissimilar to the
data points in other groups. It is basically a collection of objects on
the basis of similarity and dissimilarity between them.
(CentreforKnowledgeTransfer)
institute
• For ex– The data points in the graph below clustered together can be
classified into one single group. We can distinguish the clusters, and
we can identify that there are 3 clusters in the below picture.
(CentreforKnowledgeTransfer)
institute
• It is not necessary for clusters to be spherical. Such as :
DBSCAN: Density-based Spatial Clustering of
Applications with Noise
These data points are clustered by using the
basic concept that the data point lies within
the given constraint from the cluster center.
Various distance methods and techniques are
used for the calculation of the outliers.
(CentreforKnowledgeTransfer)
institute
Why Clustering?
• Clustering is very much important as it determines the intrinsic grouping
among the unlabelled data present.
• There are no criteria for good clustering.
• It depends on the user, what is the criteria they may use which satisfy their
need.
• For instance, we could be interested in finding representatives for
homogeneous groups (data reduction), in finding “natural clusters” and
describe their unknown properties (“natural” data types), in finding useful
and suitable groupings (“useful” data classes) or in finding unusual data
objects (outlier detection).
• This algorithm must make some assumptions that constitute the similarity
of points and each assumption make different and equally valid clusters.
(CentreforKnowledgeTransfer)
institute
Clustering Methods :
• Density-Based Methods
• Hierarchical Based Methods
• Partitioning Methods
• Grid-based Methods
(CentreforKnowledgeTransfer)
institute
Density-Based Methods
• These methods consider the clusters as the dense region having some
similarities and differences from the lower dense region of the space.
• These methods have good accuracy and the ability to merge two
clusters.
Example
• DBSCAN (Density-Based Spatial Clustering of Applications with Noise),
• OPTICS (Ordering Points to Identify Clustering Structure), etc.
(CentreforKnowledgeTransfer)
institute
Hierarchical Based Methods
• The clusters formed in this method form a tree-type structure based
on the hierarchy. New clusters are formed using the previously
formed one. It is divided into two category
• Agglomerative (bottom-up approach)
• Divisive (top-down approach)
Examples
• CURE (Clustering Using Representatives),
• BIRCH (Balanced Iterative Reducing Clustering and using Hierarchies)
(CentreforKnowledgeTransfer)
institute
Partitioning Methods
• These methods partition the objects into k clusters and each partition
forms one cluster.
• This method is used to optimize an objective criterion similarity
function such as when the distance is a major parameter
Example
• K-means,
• CLARANS (Clustering Large Applications based upon Randomized
Search)
(CentreforKnowledgeTransfer)
institute
Grid-based Methods
• In this method, the data space is formulated into a finite number of
cells that form a grid-like structure.
• All the clustering operations done on these grids are fast and
independent of the number of data objects
Example
• STING (Statistical Information Grid),
• wave cluster,
• CLIQUE (CLustering In Quest), etc.
(CentreforKnowledgeTransfer)
institute
Clustering Algorithms
• K-means clustering algorithm – It is the simplest unsupervised
learning algorithm that solves clustering problem.
• K-means algorithm partitions n observations into k clusters where
each observation belongs to the cluster with the nearest mean
serving as a prototype of the cluster.
(CentreforKnowledgeTransfer)
institute
Applications of Clustering in different fields
• Marketing: It can be used to characterize & discover customer segments
for marketing purposes.
• Biology: It can be used for classification among different species of plants
and animals.
• Libraries: It is used in clustering different books on the basis of topics and
information.
• Insurance: It is used to acknowledge the customers, their policies and
identifying the frauds.
• City Planning: It is used to make groups of houses and to study their values
based on their geographical locations and other factors present.
• Earthquake studies: By learning the earthquake-affected areas we can
determine the dangerous zones.

More Related Content

What's hot

1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalitiesKrish_ver2
 
CART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideCART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideSalford Systems
 
Classification and Clustering
Classification and ClusteringClassification and Clustering
Classification and ClusteringEng Teong Cheah
 
4.2 spatial data mining
4.2 spatial data mining4.2 spatial data mining
4.2 spatial data miningKrish_ver2
 
3.2 partitioning methods
3.2 partitioning methods3.2 partitioning methods
3.2 partitioning methodsKrish_ver2
 
Hierarchical clustering
Hierarchical clustering Hierarchical clustering
Hierarchical clustering Ashek Farabi
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning Gopal Sakarkar
 
Introduction to Clustering algorithm
Introduction to Clustering algorithmIntroduction to Clustering algorithm
Introduction to Clustering algorithmhadifar
 
Data preprocessing in Machine learning
Data preprocessing in Machine learning Data preprocessing in Machine learning
Data preprocessing in Machine learning pyingkodi maran
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingankur bhalla
 
Data Reduction
Data ReductionData Reduction
Data ReductionRajan Shah
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining Sulman Ahmed
 
Types of Machine Learning
Types of Machine LearningTypes of Machine Learning
Types of Machine LearningSamra Shahzadi
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersFunctional Imperative
 
Performance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsPerformance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsKush Kulshrestha
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series dataKrish_ver2
 

What's hot (20)

K means Clustering Algorithm
K means Clustering AlgorithmK means Clustering Algorithm
K means Clustering Algorithm
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalities
 
CART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideCART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User Guide
 
Classification and Clustering
Classification and ClusteringClassification and Clustering
Classification and Clustering
 
4.2 spatial data mining
4.2 spatial data mining4.2 spatial data mining
4.2 spatial data mining
 
Clusters techniques
Clusters techniquesClusters techniques
Clusters techniques
 
3.2 partitioning methods
3.2 partitioning methods3.2 partitioning methods
3.2 partitioning methods
 
Hierarchical clustering
Hierarchical clustering Hierarchical clustering
Hierarchical clustering
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
 
Introduction to Clustering algorithm
Introduction to Clustering algorithmIntroduction to Clustering algorithm
Introduction to Clustering algorithm
 
Data preprocessing in Machine learning
Data preprocessing in Machine learning Data preprocessing in Machine learning
Data preprocessing in Machine learning
 
2. visualization in data mining
2. visualization in data mining2. visualization in data mining
2. visualization in data mining
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data Reduction
Data ReductionData Reduction
Data Reduction
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
 
Types of Machine Learning
Types of Machine LearningTypes of Machine Learning
Types of Machine Learning
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Performance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsPerformance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning Algorithms
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series data
 

Similar to Clustering

clustering and distance metrics.pptx
clustering and distance metrics.pptxclustering and distance metrics.pptx
clustering and distance metrics.pptxssuser2e437f
 
UNIT - 4: Data Warehousing and Data Mining
UNIT - 4: Data Warehousing and Data MiningUNIT - 4: Data Warehousing and Data Mining
UNIT - 4: Data Warehousing and Data MiningNandakumar P
 
CLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfCLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfSowmyaJyothi3
 
Unsupervised learning (clustering)
Unsupervised learning (clustering)Unsupervised learning (clustering)
Unsupervised learning (clustering)Pravinkumar Landge
 
Clustering[306] [Read-Only].pdf
Clustering[306] [Read-Only].pdfClustering[306] [Read-Only].pdf
Clustering[306] [Read-Only].pdfigeabroad
 
26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.pptvikassingh569137
 
CSA 3702 machine learning module 3
CSA 3702 machine learning module 3CSA 3702 machine learning module 3
CSA 3702 machine learning module 3Nandhini S
 
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...IJCSIS Research Publications
 
Cancer data partitioning with data structure and difficulty independent clust...
Cancer data partitioning with data structure and difficulty independent clust...Cancer data partitioning with data structure and difficulty independent clust...
Cancer data partitioning with data structure and difficulty independent clust...IRJET Journal
 
computational statistics machine learning unit 5.pptx
computational statistics machine learning unit 5.pptxcomputational statistics machine learning unit 5.pptx
computational statistics machine learning unit 5.pptxAnubhavKushagra
 

Similar to Clustering (20)

clustering and distance metrics.pptx
clustering and distance metrics.pptxclustering and distance metrics.pptx
clustering and distance metrics.pptx
 
Data mining
Data miningData mining
Data mining
 
UNIT - 4: Data Warehousing and Data Mining
UNIT - 4: Data Warehousing and Data MiningUNIT - 4: Data Warehousing and Data Mining
UNIT - 4: Data Warehousing and Data Mining
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
CLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfCLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdf
 
Ir3116271633
Ir3116271633Ir3116271633
Ir3116271633
 
Unsupervised learning (clustering)
Unsupervised learning (clustering)Unsupervised learning (clustering)
Unsupervised learning (clustering)
 
Rohit 10103543
Rohit 10103543Rohit 10103543
Rohit 10103543
 
cluster.pptx
cluster.pptxcluster.pptx
cluster.pptx
 
Clustering[306] [Read-Only].pdf
Clustering[306] [Read-Only].pdfClustering[306] [Read-Only].pdf
Clustering[306] [Read-Only].pdf
 
26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt
 
DM_clustering.ppt
DM_clustering.pptDM_clustering.ppt
DM_clustering.ppt
 
Dp33701704
Dp33701704Dp33701704
Dp33701704
 
Dp33701704
Dp33701704Dp33701704
Dp33701704
 
CSA 3702 machine learning module 3
CSA 3702 machine learning module 3CSA 3702 machine learning module 3
CSA 3702 machine learning module 3
 
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
Improved K-mean Clustering Algorithm for Prediction Analysis using Classifica...
 
Cancer data partitioning with data structure and difficulty independent clust...
Cancer data partitioning with data structure and difficulty independent clust...Cancer data partitioning with data structure and difficulty independent clust...
Cancer data partitioning with data structure and difficulty independent clust...
 
Chapter 5.pdf
Chapter 5.pdfChapter 5.pdf
Chapter 5.pdf
 
Clustering on DSS
Clustering on DSSClustering on DSS
Clustering on DSS
 
computational statistics machine learning unit 5.pptx
computational statistics machine learning unit 5.pptxcomputational statistics machine learning unit 5.pptx
computational statistics machine learning unit 5.pptx
 

More from Dr. C.V. Suresh Babu

Diagnosis test of diabetics and hypertension by AI
Diagnosis test of diabetics and hypertension by AIDiagnosis test of diabetics and hypertension by AI
Diagnosis test of diabetics and hypertension by AIDr. C.V. Suresh Babu
 
A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”Dr. C.V. Suresh Babu
 
A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”Dr. C.V. Suresh Babu
 
A study on “the impact of data analytics in covid 19 health care system”
A study on “the impact of data analytics in covid 19 health care system”A study on “the impact of data analytics in covid 19 health care system”
A study on “the impact of data analytics in covid 19 health care system”Dr. C.V. Suresh Babu
 

More from Dr. C.V. Suresh Babu (20)

Data analytics with R
Data analytics with RData analytics with R
Data analytics with R
 
Association rules
Association rulesAssociation rules
Association rules
 
Classification
ClassificationClassification
Classification
 
Blue property assumptions.
Blue property assumptions.Blue property assumptions.
Blue property assumptions.
 
Introduction to regression
Introduction to regressionIntroduction to regression
Introduction to regression
 
DART
DARTDART
DART
 
Mycin
MycinMycin
Mycin
 
Expert systems
Expert systemsExpert systems
Expert systems
 
Dempster shafer theory
Dempster shafer theoryDempster shafer theory
Dempster shafer theory
 
Bayes network
Bayes networkBayes network
Bayes network
 
Bayes' theorem
Bayes' theoremBayes' theorem
Bayes' theorem
 
Knowledge based agents
Knowledge based agentsKnowledge based agents
Knowledge based agents
 
Rule based system
Rule based systemRule based system
Rule based system
 
Formal Logic in AI
Formal Logic in AIFormal Logic in AI
Formal Logic in AI
 
Production based system
Production based systemProduction based system
Production based system
 
Game playing in AI
Game playing in AIGame playing in AI
Game playing in AI
 
Diagnosis test of diabetics and hypertension by AI
Diagnosis test of diabetics and hypertension by AIDiagnosis test of diabetics and hypertension by AI
Diagnosis test of diabetics and hypertension by AI
 
A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”
 
A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”
 
A study on “the impact of data analytics in covid 19 health care system”
A study on “the impact of data analytics in covid 19 health care system”A study on “the impact of data analytics in covid 19 health care system”
A study on “the impact of data analytics in covid 19 health care system”
 

Recently uploaded

Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degreeyuu sss
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 

Recently uploaded (20)

Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 

Clustering

  • 2. (CentreforKnowledgeTransfer) institute What is Clustering • Clustering is the task of dividing the population or data points into a number of groups such that data points in the same groups are more similar to other data points in the same group and dissimilar to the data points in other groups. It is basically a collection of objects on the basis of similarity and dissimilarity between them.
  • 3. (CentreforKnowledgeTransfer) institute • For ex– The data points in the graph below clustered together can be classified into one single group. We can distinguish the clusters, and we can identify that there are 3 clusters in the below picture.
  • 4. (CentreforKnowledgeTransfer) institute • It is not necessary for clusters to be spherical. Such as : DBSCAN: Density-based Spatial Clustering of Applications with Noise These data points are clustered by using the basic concept that the data point lies within the given constraint from the cluster center. Various distance methods and techniques are used for the calculation of the outliers.
  • 5. (CentreforKnowledgeTransfer) institute Why Clustering? • Clustering is very much important as it determines the intrinsic grouping among the unlabelled data present. • There are no criteria for good clustering. • It depends on the user, what is the criteria they may use which satisfy their need. • For instance, we could be interested in finding representatives for homogeneous groups (data reduction), in finding “natural clusters” and describe their unknown properties (“natural” data types), in finding useful and suitable groupings (“useful” data classes) or in finding unusual data objects (outlier detection). • This algorithm must make some assumptions that constitute the similarity of points and each assumption make different and equally valid clusters.
  • 6. (CentreforKnowledgeTransfer) institute Clustering Methods : • Density-Based Methods • Hierarchical Based Methods • Partitioning Methods • Grid-based Methods
  • 7. (CentreforKnowledgeTransfer) institute Density-Based Methods • These methods consider the clusters as the dense region having some similarities and differences from the lower dense region of the space. • These methods have good accuracy and the ability to merge two clusters. Example • DBSCAN (Density-Based Spatial Clustering of Applications with Noise), • OPTICS (Ordering Points to Identify Clustering Structure), etc.
  • 8. (CentreforKnowledgeTransfer) institute Hierarchical Based Methods • The clusters formed in this method form a tree-type structure based on the hierarchy. New clusters are formed using the previously formed one. It is divided into two category • Agglomerative (bottom-up approach) • Divisive (top-down approach) Examples • CURE (Clustering Using Representatives), • BIRCH (Balanced Iterative Reducing Clustering and using Hierarchies)
  • 9. (CentreforKnowledgeTransfer) institute Partitioning Methods • These methods partition the objects into k clusters and each partition forms one cluster. • This method is used to optimize an objective criterion similarity function such as when the distance is a major parameter Example • K-means, • CLARANS (Clustering Large Applications based upon Randomized Search)
  • 10. (CentreforKnowledgeTransfer) institute Grid-based Methods • In this method, the data space is formulated into a finite number of cells that form a grid-like structure. • All the clustering operations done on these grids are fast and independent of the number of data objects Example • STING (Statistical Information Grid), • wave cluster, • CLIQUE (CLustering In Quest), etc.
  • 11. (CentreforKnowledgeTransfer) institute Clustering Algorithms • K-means clustering algorithm – It is the simplest unsupervised learning algorithm that solves clustering problem. • K-means algorithm partitions n observations into k clusters where each observation belongs to the cluster with the nearest mean serving as a prototype of the cluster.
  • 12. (CentreforKnowledgeTransfer) institute Applications of Clustering in different fields • Marketing: It can be used to characterize & discover customer segments for marketing purposes. • Biology: It can be used for classification among different species of plants and animals. • Libraries: It is used in clustering different books on the basis of topics and information. • Insurance: It is used to acknowledge the customers, their policies and identifying the frauds. • City Planning: It is used to make groups of houses and to study their values based on their geographical locations and other factors present. • Earthquake studies: By learning the earthquake-affected areas we can determine the dangerous zones.