SlideShare a Scribd company logo
1 of 14
 While partitioning methods meet the basic clustering
requirement of organizing a set of objects into a number of
exclusive groups, in some situations we may want to partition
our data into groups at different levels such as in a hierarchy.
 A hierarchical clustering method works by grouping data objects
into a hierarchy or “tree” of clusters.
 Representing data objects in the form of a hierarchy is usefull
for data summarization and visualization.
 The multiple-phase (or multiphase) clustering.
I. Agglomerative versus Divisive Hierarchical Clustering:
 A hierarchical clustering method can be either agglomerative or divisive,
depending on whether the hierarchical decomposition is formed in a
bottom-up or top-down fashion.
 An agglomerative hierarchical clustering method uses a bottom-up
strategy. The single cluster becomes the hierarchy`s root.
 A divisive hierarchical clustering method employs a top-down strategy. It
starts by placing all objects in one cluster, which is the hierarchy`s root.
 This is a single-linkage approach in that each cluster is represented by all
the objects in the cluster, and the similarity between two clusters is
measured by the similarity of the closest pair of data points belonging to
different cluster.
 The cluster-splitting process repeats until, eventually, each new cluster
contains only a single objects.
 A tree structure called a dendrogram is commonly used to represent the
process of hierarchical clustering.
II. Distance Measures in Algorithmic Methods:
 whether using an agglomerative method or a divisive method, a core need
is to measure the distance between two clusters, where each cluster is
generally a set of objects.
 Four widely used measures for distance between clusters, where |p-p`| is
the distance between two objects or points, p and p`; mi is the mean for
cluster, Ci and ni is the number of objects in Ci.
 They are also known as linkage measures.
 Minimum distance: distmin(Ci, Cj)= min {|p-p`|}
pCᵢ;p`Cj
 Maximum distance: distmax(Ci, Cj)= min {|p-p`|}
pCᵢ;p`Cj
 Mean distance: distmin(Ci, Cj)=  Mᵢ- Mj
 Average distance: distmin(Ci, Cj)= 1/ nᵢnj p - p`
pCᵢ;p`Cj
III. BIRCH: Multiphase Hierarchical Clustering Using
Clustering Feature Trees
 Balanced Iterative Reducing and Clustering using Hierarchical (BIRCH)
is designed for clustering a large amount of numeric data by integrating
hierarchical clustering(at the initial micro clustering stage) and other
clustering methods such as iterative partitioning(at the macro clustering
stage).
 The two difficulties in agglomerative clustering methods:
Scalability
Inability
 BIRCH uses the notions of clustering feature to summarize a cluster, and
clustering feature tree(CF-tree) to represent a cluster hierarchy.
 Consider a cluster of n d-dimensional data objects or points.
 A clustering feature is essentially a summary of the statistics for the given
cluster. The cluster`s centroid, xₒ, radius, R, and diameter, D, are
 Phase 1:
BIRCH scans the database to build an initial in-memory CF-
tree, which can be viewed as a multilevel compression of the
data that tries to preserve the data`s inherent clustering structure.
 Phase 2:
BIRCH applies a (selected) clustering algorithm to cluster the
leaf nodes of the CF-tree, which removes sparse clusters and
groups dense cluster into larger ones.
IV. Chameleon: Multiphase Hierarchical Clustering Using
Dynamic Modeling
 Chameleon is a hierarchical clustering algorithm that uses dynamic modeling
to determine the similarity between pairs of clusters.
 The connected objects are within a clusters and the proximity of clusters.
 That is, two clusters are merged if their interconnectivity high and they are
close together.
 Chameleon uses a k-nearest-neighbor graph approach to construct a sparse
graph.
 Chameleon uses a graph partitioning algorithm to partition the
k-nearest-neighbor graph into a large number of relatively small
subclusters such that it minimizes the edge cut.
 Chameleon determines the similarity between each pair of
clusters Cᵢ and Cj according to their relative interconnectivity,
RI(Cᵢ,Cj), and their relative closeness, RC(Ci,Cj).
 The processing cost for high-dimensional data may require
O(n²) time for n objects in the worst case.
V. Probabilistic Hierarchical Clustering
 Algorithmic hierarchical clustering methods using linkage measures tend
to be easy to understand and are often efficient in clustering.
 They are commonly used in many clustering analysis applications.
 Algorithmic hierarchical clustering methods can suffer from several
drawbacks.
 One way to look at the clustering problem is to regard the set of data
objects to be clustered as sample of the underlying data generation
mechanism to be analyzed or, formally, the generative model.
Hierarchical Clustering Methods and Approaches
Hierarchical Clustering Methods and Approaches

More Related Content

What's hot

Density based methods
Density based methodsDensity based methods
Density based methodsSVijaylakshmi
 
3.3 hierarchical methods
3.3 hierarchical methods3.3 hierarchical methods
3.3 hierarchical methodsKrish_ver2
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsDatamining Tools
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2Fabio Fumarola
 
Mining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsMining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsJustin Cletus
 
program partitioning and scheduling IN Advanced Computer Architecture
program partitioning and scheduling  IN Advanced Computer Architectureprogram partitioning and scheduling  IN Advanced Computer Architecture
program partitioning and scheduling IN Advanced Computer ArchitecturePankaj Kumar Jain
 
K means clustering
K means clusteringK means clustering
K means clusteringkeshav goyal
 
1.8 discretization
1.8 discretization1.8 discretization
1.8 discretizationKrish_ver2
 
Density Based Clustering
Density Based ClusteringDensity Based Clustering
Density Based ClusteringSSA KPI
 
Data Mining: clustering and analysis
Data Mining: clustering and analysisData Mining: clustering and analysis
Data Mining: clustering and analysisDataminingTools Inc
 
All data models in dbms
All data models in dbmsAll data models in dbms
All data models in dbmsNaresh Kumar
 

What's hot (20)

Density based methods
Density based methodsDensity based methods
Density based methods
 
3.3 hierarchical methods
3.3 hierarchical methods3.3 hierarchical methods
3.3 hierarchical methods
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
 
3. mining frequent patterns
3. mining frequent patterns3. mining frequent patterns
3. mining frequent patterns
 
2. visualization in data mining
2. visualization in data mining2. visualization in data mining
2. visualization in data mining
 
UNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptxUNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptx
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learning
 
Clustering
ClusteringClustering
Clustering
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2
 
Mining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsMining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and Correlations
 
Data discretization
Data discretizationData discretization
Data discretization
 
program partitioning and scheduling IN Advanced Computer Architecture
program partitioning and scheduling  IN Advanced Computer Architectureprogram partitioning and scheduling  IN Advanced Computer Architecture
program partitioning and scheduling IN Advanced Computer Architecture
 
K means clustering
K means clusteringK means clustering
K means clustering
 
Data clustering
Data clustering Data clustering
Data clustering
 
1.8 discretization
1.8 discretization1.8 discretization
1.8 discretization
 
Density Based Clustering
Density Based ClusteringDensity Based Clustering
Density Based Clustering
 
DBMS OF DATA MODEL Deepika 2
DBMS OF DATA MODEL  Deepika 2DBMS OF DATA MODEL  Deepika 2
DBMS OF DATA MODEL Deepika 2
 
Data Mining: clustering and analysis
Data Mining: clustering and analysisData Mining: clustering and analysis
Data Mining: clustering and analysis
 
Pram model
Pram modelPram model
Pram model
 
All data models in dbms
All data models in dbmsAll data models in dbms
All data models in dbms
 

Similar to Hierarchical Clustering Methods and Approaches

clustering ppt.pptx
clustering ppt.pptxclustering ppt.pptx
clustering ppt.pptxchmeghana1
 
Paper id 26201478
Paper id 26201478Paper id 26201478
Paper id 26201478IJRAT
 
Data Mining: clustering and analysis
Data Mining: clustering and analysisData Mining: clustering and analysis
Data Mining: clustering and analysisDatamining Tools
 
K- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxK- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxSaiPragnaKancheti
 
K- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxK- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxSaiPragnaKancheti
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
 A survey on Efficient Enhanced K-Means Clustering Algorithm A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithmijsrd.com
 
CLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfCLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfSowmyaJyothi3
 
Data Science - Part VII - Cluster Analysis
Data Science - Part VII -  Cluster AnalysisData Science - Part VII -  Cluster Analysis
Data Science - Part VII - Cluster AnalysisDerek Kane
 
Unsupervised Learning.pptx
Unsupervised Learning.pptxUnsupervised Learning.pptx
Unsupervised Learning.pptxGandhiMathy6
 
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptChapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptSubrata Kumer Paul
 
Data Mining: Cluster Analysis
Data Mining: Cluster AnalysisData Mining: Cluster Analysis
Data Mining: Cluster AnalysisSuman Mia
 
iiit delhi unsupervised pdf.pdf
iiit delhi unsupervised pdf.pdfiiit delhi unsupervised pdf.pdf
iiit delhi unsupervised pdf.pdfVIKASGUPTA127897
 

Similar to Hierarchical Clustering Methods and Approaches (20)

A0310112
A0310112A0310112
A0310112
 
clustering ppt.pptx
clustering ppt.pptxclustering ppt.pptx
clustering ppt.pptx
 
My8clst
My8clstMy8clst
My8clst
 
Paper id 26201478
Paper id 26201478Paper id 26201478
Paper id 26201478
 
Du35687693
Du35687693Du35687693
Du35687693
 
Dataa miining
Dataa miiningDataa miining
Dataa miining
 
47 292-298
47 292-29847 292-298
47 292-298
 
Chapter 5.pdf
Chapter 5.pdfChapter 5.pdf
Chapter 5.pdf
 
Data Mining: clustering and analysis
Data Mining: clustering and analysisData Mining: clustering and analysis
Data Mining: clustering and analysis
 
K- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxK- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptx
 
K- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptxK- means clustering method based Data Mining of Network Shared Resources .pptx
K- means clustering method based Data Mining of Network Shared Resources .pptx
 
Hierarchical clustering
Hierarchical clusteringHierarchical clustering
Hierarchical clustering
 
A survey on Efficient Enhanced K-Means Clustering Algorithm
 A survey on Efficient Enhanced K-Means Clustering Algorithm A survey on Efficient Enhanced K-Means Clustering Algorithm
A survey on Efficient Enhanced K-Means Clustering Algorithm
 
CLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdfCLUSTERING IN DATA MINING.pdf
CLUSTERING IN DATA MINING.pdf
 
Hierarchical clustering
Hierarchical clusteringHierarchical clustering
Hierarchical clustering
 
Data Science - Part VII - Cluster Analysis
Data Science - Part VII -  Cluster AnalysisData Science - Part VII -  Cluster Analysis
Data Science - Part VII - Cluster Analysis
 
Unsupervised Learning.pptx
Unsupervised Learning.pptxUnsupervised Learning.pptx
Unsupervised Learning.pptx
 
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptChapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
 
Data Mining: Cluster Analysis
Data Mining: Cluster AnalysisData Mining: Cluster Analysis
Data Mining: Cluster Analysis
 
iiit delhi unsupervised pdf.pdf
iiit delhi unsupervised pdf.pdfiiit delhi unsupervised pdf.pdf
iiit delhi unsupervised pdf.pdf
 

More from rajshreemuthiah (20)

oracle
oracleoracle
oracle
 
quality
qualityquality
quality
 
bigdata
bigdatabigdata
bigdata
 
polymorphism
polymorphismpolymorphism
polymorphism
 
solutions and understanding text analytics
solutions and understanding text analyticssolutions and understanding text analytics
solutions and understanding text analytics
 
interface
interfaceinterface
interface
 
Testing &ampdebugging
Testing &ampdebuggingTesting &ampdebugging
Testing &ampdebugging
 
concurrency control
concurrency controlconcurrency control
concurrency control
 
Education
EducationEducation
Education
 
Formal verification
Formal verificationFormal verification
Formal verification
 
Transaction management
Transaction management Transaction management
Transaction management
 
Multi thread
Multi threadMulti thread
Multi thread
 
System testing
System testingSystem testing
System testing
 
software maintenance
software maintenancesoftware maintenance
software maintenance
 
exception handling
exception handlingexception handling
exception handling
 
e governance
e governancee governance
e governance
 
recovery management
recovery managementrecovery management
recovery management
 
Implementing polymorphism
Implementing polymorphismImplementing polymorphism
Implementing polymorphism
 
Buffer managements
Buffer managementsBuffer managements
Buffer managements
 
os linux
os linuxos linux
os linux
 

Recently uploaded

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 

Recently uploaded (20)

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 

Hierarchical Clustering Methods and Approaches

  • 1.
  • 2.  While partitioning methods meet the basic clustering requirement of organizing a set of objects into a number of exclusive groups, in some situations we may want to partition our data into groups at different levels such as in a hierarchy.  A hierarchical clustering method works by grouping data objects into a hierarchy or “tree” of clusters.  Representing data objects in the form of a hierarchy is usefull for data summarization and visualization.  The multiple-phase (or multiphase) clustering.
  • 3. I. Agglomerative versus Divisive Hierarchical Clustering:  A hierarchical clustering method can be either agglomerative or divisive, depending on whether the hierarchical decomposition is formed in a bottom-up or top-down fashion.  An agglomerative hierarchical clustering method uses a bottom-up strategy. The single cluster becomes the hierarchy`s root.  A divisive hierarchical clustering method employs a top-down strategy. It starts by placing all objects in one cluster, which is the hierarchy`s root.
  • 4.  This is a single-linkage approach in that each cluster is represented by all the objects in the cluster, and the similarity between two clusters is measured by the similarity of the closest pair of data points belonging to different cluster.  The cluster-splitting process repeats until, eventually, each new cluster contains only a single objects.  A tree structure called a dendrogram is commonly used to represent the process of hierarchical clustering.
  • 5. II. Distance Measures in Algorithmic Methods:  whether using an agglomerative method or a divisive method, a core need is to measure the distance between two clusters, where each cluster is generally a set of objects.  Four widely used measures for distance between clusters, where |p-p`| is the distance between two objects or points, p and p`; mi is the mean for cluster, Ci and ni is the number of objects in Ci.  They are also known as linkage measures.
  • 6.  Minimum distance: distmin(Ci, Cj)= min {|p-p`|} pCᵢ;p`Cj  Maximum distance: distmax(Ci, Cj)= min {|p-p`|} pCᵢ;p`Cj  Mean distance: distmin(Ci, Cj)=  Mᵢ- Mj  Average distance: distmin(Ci, Cj)= 1/ nᵢnj p - p` pCᵢ;p`Cj
  • 7. III. BIRCH: Multiphase Hierarchical Clustering Using Clustering Feature Trees  Balanced Iterative Reducing and Clustering using Hierarchical (BIRCH) is designed for clustering a large amount of numeric data by integrating hierarchical clustering(at the initial micro clustering stage) and other clustering methods such as iterative partitioning(at the macro clustering stage).  The two difficulties in agglomerative clustering methods: Scalability Inability
  • 8.  BIRCH uses the notions of clustering feature to summarize a cluster, and clustering feature tree(CF-tree) to represent a cluster hierarchy.  Consider a cluster of n d-dimensional data objects or points.  A clustering feature is essentially a summary of the statistics for the given cluster. The cluster`s centroid, xₒ, radius, R, and diameter, D, are
  • 9.  Phase 1: BIRCH scans the database to build an initial in-memory CF- tree, which can be viewed as a multilevel compression of the data that tries to preserve the data`s inherent clustering structure.  Phase 2: BIRCH applies a (selected) clustering algorithm to cluster the leaf nodes of the CF-tree, which removes sparse clusters and groups dense cluster into larger ones.
  • 10. IV. Chameleon: Multiphase Hierarchical Clustering Using Dynamic Modeling  Chameleon is a hierarchical clustering algorithm that uses dynamic modeling to determine the similarity between pairs of clusters.  The connected objects are within a clusters and the proximity of clusters.  That is, two clusters are merged if their interconnectivity high and they are close together.  Chameleon uses a k-nearest-neighbor graph approach to construct a sparse graph.
  • 11.  Chameleon uses a graph partitioning algorithm to partition the k-nearest-neighbor graph into a large number of relatively small subclusters such that it minimizes the edge cut.  Chameleon determines the similarity between each pair of clusters Cᵢ and Cj according to their relative interconnectivity, RI(Cᵢ,Cj), and their relative closeness, RC(Ci,Cj).  The processing cost for high-dimensional data may require O(n²) time for n objects in the worst case.
  • 12. V. Probabilistic Hierarchical Clustering  Algorithmic hierarchical clustering methods using linkage measures tend to be easy to understand and are often efficient in clustering.  They are commonly used in many clustering analysis applications.  Algorithmic hierarchical clustering methods can suffer from several drawbacks.  One way to look at the clustering problem is to regard the set of data objects to be clustered as sample of the underlying data generation mechanism to be analyzed or, formally, the generative model.