Cluster analysis
Presented by: Shubham Goyal
Cluster analysis
• What is cluster analysis?
• Types of data in cluster analysis
• Major clustering methods
• Summary
What is cluster analysis?
Cluster: a collection of data objects that are
o similar to one another within the same cluster
o dissimilar to the objects in the other clusters
Aim of clustering: to group a set of data objects into clusters
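
The definition above can be checked numerically on a toy data set. The sketch below is illustrative only: the two point groups and the helper names `mean_within` and `mean_between` are assumptions made for the example, and Euclidean distance is just one possible way to measure dissimilarity.

```python
from itertools import combinations, product
from math import dist

# Two hypothetical groups of 2-D objects (made-up data for illustration).
cluster_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
cluster_b = [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]


def mean_within(points):
    """Average Euclidean distance over all pairs inside one cluster."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_between(left, right):
    """Average Euclidean distance over all pairs that span two clusters."""
    pairs = list(product(left, right))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


within_a = mean_within(cluster_a)
within_b = mean_within(cluster_b)
between = mean_between(cluster_a, cluster_b)
print(within_a, within_b, between)
```

For a reasonable grouping, the within-cluster averages should be clearly smaller than the between-cluster average.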
APPLICATIONS OF CLUSTERING
Marketing: discovering distinct customer groups in a purchase database
Land use: identifying areas of similar land use in an earth observation database
Insurance: identifying groups of motor insurance policy holders with a high average claim cost
City planning: identifying groups of houses according to their house type, value, and geographical location
TYPES OF DATA IN CLUSTER ANALYSIS
• Interval-scaled variables
• Binary variables
• Ordinal variables
• Ratio variables
• Complex data types
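
The slides do not say how dissimilarity is computed for each data type. As a hedged illustration, the sketch below uses two common choices that are assumptions here: z-score standardization followed by Euclidean distance for interval-scaled variables, and the simple matching coefficient for binary variables. All data values and function names are made up for the example.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(column):
    """Z-score one interval-scaled variable (assumes the values are not all equal)."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def simple_matching_dissimilarity(p, q):
    """Fraction of binary attributes on which two objects disagree."""
    mismatches = sum(a != b for a, b in zip(p, q))
    return mismatches / len(p)


# Interval-scaled example: standardize each variable so units do not dominate.
heights_cm = [150.0, 160.0, 170.0, 180.0]
weights_kg = [50.0, 60.0, 70.0, 80.0]
rows = list(zip(standardize(heights_cm), standardize(weights_kg)))
d_interval = euclidean(rows[0], rows[1])

# Binary example: two objects described by four yes/no attributes.
d_binary = simple_matching_dissimilarity((1, 0, 1, 1), (1, 1, 0, 1))
print(d_interval, d_binary)
```

Standardizing first keeps a variable measured in large units (for example centimetres) from outweighing one measured in small units.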
MAJOR CLUSTERING METHODS
• Partitioning methods
o K-means method
• Hierarchical methods
K-MEANS CLUSTERING METHOD
Input to the algorithm: the number of clusters k, and a database of n objects
The algorithm consists of four steps:
1. Partition the objects into k nonempty subsets (clusters)
2. Compute the seed point of each cluster in the current partition as its centroid (the mean of the objects in the cluster)
3. Assign each object to the cluster with the nearest centroid
4. Go back to Step 2; stop when there are no more new assignments
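
The four steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the round-robin initial partition, the Euclidean distance, the `max_iter` safety cap, and the rule of keeping the previous centroid when a cluster becomes empty are all choices made for the example and are not given in the slides.

```python
from math import dist


def centroid(points):
    """Mean of a non-empty list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(objects, k, max_iter=100):
    """Return (labels, centroids) for a list of numeric tuples."""
    if not 1 <= k <= len(objects):
        raise ValueError("k must be between 1 and the number of objects")
    # Step 1: partition into k nonempty subsets (round-robin is an assumption).
    labels = [i % k for i in range(len(objects))]
    centroids = [None] * k
    for _ in range(max_iter):
        # Step 2: compute the centroid of each cluster in the current partition.
        for c in range(k):
            members = [p for p, lab in zip(objects, labels) if lab == c]
            if members:  # assumption: an emptied cluster keeps its old centroid
                centroids[c] = centroid(members)
        # Step 3: assign each object to the cluster with the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in objects
        ]
        # Step 4: stop when there are no more new assignments.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

On this toy input the two well-separated groups end up in different clusters. On less clearly separated data the result can depend on the initial partition.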
K-MEANS CLUSTERING METHOD: EXAMPLE
[Figure: four plot panels, each with x and y axes running from 0 to 10]
Complexities of K-means
Time complexity
• Let t_dist be the time to calculate the distance between two objects
• Time complexity of each iteration: O(K n t_dist), where K = number of clusters (centroids) and n = number of objects
• Bounding the number of iterations by I gives O(I K n t_dist)
Space complexity
• For m-dimensional vectors
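
To make the time bound concrete, the sketch below counts the distance evaluations implied by I iterations over n objects and K centroids. The slide's space-complexity line gives no formula, so the storage count here is an assumption: it simply counts the coordinates of n objects and K centroids in m dimensions. The function names and input sizes are illustrative.

```python
def distance_computations(n, k, iterations):
    """Distance evaluations in the assignment steps: I * K * n."""
    return iterations * k * n


def stored_values(n, k, m):
    """Coordinates held for n objects plus k centroids, each m-dimensional.

    Assumption: this accounting is not taken from the slides.
    """
    return (n + k) * m


print(distance_computations(n=10_000, k=5, iterations=20))  # 1000000
print(stored_values(n=10_000, k=5, m=3))  # 30015
```

Multiplying the first count by t_dist gives the O(I K n t_dist) running time stated above.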
Strengths of k-means clustering
• Relatively efficient: O(t k n), where n is the number of objects, k is the number of clusters, and t is the number of iterations. Normally k, t << n.
• K-means may produce tighter clusters than hierarchical clustering.
Weaknesses of k-means clustering
• Applicable only when the mean is defined, so it works only for numerical observations and not directly for categorical data.
• The number of clusters, k, must be specified in advance.
• Sensitive to noisy data and outliers, which can pull a centroid away from the bulk of its cluster.
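
The outlier weakness can be seen with a one-dimensional toy cluster. In the sketch below the data values are made up, and the median is shown only as a point of comparison; it is not something the slides propose.

```python
from statistics import mean, median

# A hypothetical one-dimensional cluster, then the same cluster plus one outlier.
cluster = [2.0, 3.0, 4.0]
with_outlier = cluster + [100.0]

# How far each summary moves when the outlier is added.
shift_mean = mean(with_outlier) - mean(cluster)
shift_median = median(with_outlier) - median(cluster)

print(shift_mean)    # 24.25
print(shift_median)  # 0.5
```

A single extreme value moves the mean, and therefore the k-means centroid, far more than it moves the median.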
Summary
• Cluster analysis groups objects based on their similarity
• Cluster analysis has wide applications
• Measures of similarity can be computed for various types of data
• The selection of a similarity measure depends on the data used and the type of similarity we are searching for