2. Two types of hierarchical clustering
• Divisive hierarchical clustering: top-down approach
• Agglomerative hierarchical clustering: bottom-up approach
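The bottom-up idea can be sketched in a few lines of plain Python. This is a minimal, naive illustration rather than a production implementation: the function name `agglomerate`, the sample points, and the choice of single linkage with Euclidean distance are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def agglomerate(points):
    """Naive bottom-up clustering; returns the cluster formed at each merge step.

    Assumes Euclidean distance and single linkage (closest pair of members).
    """
    clusters = [[p] for p in points]  # start: every point is its own cluster
    history = []
    while len(clusters) > 1:
        # Pick the two clusters whose closest members are nearest to each other.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                dist(a, b) for a in clusters[ij[0]] for b in clusters[ij[1]]
            ),
        )
        merged = clusters[i] + clusters[j]
        history.append(merged)
        # Replace the two merged clusters with their union.
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return history


points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
for step, merged in enumerate(agglomerate(points), start=1):
    print(step, merged)
```

A divisive method would run in the opposite direction: start from one cluster holding every point and split it repeatedly.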
4/30/17
9. Impact of metrics
Distance metric
• In a 2-dimensional space, the distance
between the point (1,1) and the origin
(0,0) is 2 under Manhattan distance
but √2 under Euclidean distance.
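The two values can be checked directly. The helper names `manhattan` and `euclidean` below are chosen for this sketch and are not taken from any particular library.

```python
from math import sqrt


def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """Straight-line distance."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


p, origin = (1, 1), (0, 0)
print(manhattan(p, origin))  # 2
print(euclidean(p, origin))  # 1.4142135623730951, i.e. sqrt(2)
```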
Linkage criteria
• The distance between two clusters can
differ depending on the linkage criterion used
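As a small illustration, the same two clusters can be measured under several linkage criteria. The function name `linkage_distances` and the sample clusters are assumptions made for this sketch. Average linkage is included only for comparison; it is not covered on these slides.

```python
from itertools import product
from math import dist


def linkage_distances(a, b):
    """Distance between clusters a and b under three linkage criteria."""
    pairwise = [dist(p, q) for p, q in product(a, b)]
    return {
        "single": min(pairwise),    # closest pair of members
        "complete": max(pairwise),  # farthest pair of members
        "average": sum(pairwise) / len(pairwise),
    }


a = [(0, 0), (1, 0)]
b = [(3, 0), (6, 0)]
print(linkage_distances(a, b))
# {'single': 2.0, 'complete': 6.0, 'average': 4.0}
```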
10. Linkage criteria
• Complete linkage is one of the most widely used linkage criteria for hierarchical
clustering. It is less sensitive than single linkage to noise points that bridge clusters.
• Single linkage can handle non-elliptical shapes. However, it can produce
clusters that are quite heterogeneous internally, and it is more sensitive to outliers
and noise.
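The difference shows up even on a toy 1-D data set whose gaps grow slowly from left to right. This is a rough sketch: the function `cluster`, the sample points, and the target of two clusters are assumptions chosen to make the effect visible.

```python
from itertools import combinations


def cluster(points, k, linkage):
    """Merge 1-D points bottom-up until k clusters remain.

    `linkage` is the built-in `min` (single) or `max` (complete).
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(
                abs(a - b) for a in clusters[ij[0]] for b in clusters[ij[1]]
            ),
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return sorted(clusters)


points = [0, 10, 21, 33, 46, 60]
print(cluster(points, 2, min))  # single:   [[0, 10, 21, 33, 46], [60]]
print(cluster(points, 2, max))  # complete: [[0, 10, 21, 33], [46, 60]]
```

Single linkage chains neighbouring points into one long cluster and leaves a lone point behind, while complete linkage keeps both clusters more compact.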
11. Pros and cons: Hierarchical Clustering
• Pros
• No need to assume a particular number of clusters in advance
• Cons
• Too slow for large data sets: O(n² log n)
• Once a decision is made to combine two clusters, it can’t be undone
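To get a feel for the quadratic term, it helps to count how many pairwise distances exist for n points. The helper name `pairwise_count` and the sample sizes are assumptions for this sketch. It only counts pairs and does not time a real clustering run.

```python
from math import comb


def pairwise_count(n):
    """Number of distinct point pairs among n points: n * (n - 1) / 2."""
    return comb(n, 2)


for n in (1_000, 10_000, 100_000):
    print(f"{n:>7,} points -> {pairwise_count(n):>13,} pairwise distances")
```

Going from 1,000 to 100,000 points raises the pair count from about half a million to about five billion.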