3. Overview of Clustering
• Feature Selection
• Feature Extraction
• Transformations of the input features to produce
new salient features.
• Inter-pattern Similarity
• Grouping
4. Formal Definition
• Clustering is the classification of objects into different
groups, or more precisely, the partitioning of a data set into
subsets (clusters), so that the data in each subset (ideally)
share some common trait, often according to some defined
distance measure.
5. Notion of a Cluster can be Ambiguous
How many clusters? (Figure: the same set of points grouped as
two, four, or six clusters.)
9. Hierarchical Clustering
• Advantages
• Dendrograms are great for visualization
• Provides hierarchical relations between clusters
• Shown to be able to capture concentric clusters
• Disadvantages
• Not easy to decide at which level to cut the dendrogram to obtain clusters
• Other clustering techniques have been shown to outperform hierarchical
clustering in some experiments
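The merge loop behind (agglomerative) hierarchical clustering can be sketched in a few lines of Python. The seven 2-D points and the single-link merge criterion below are illustrative assumptions, not taken from the slides:

```python
from itertools import combinations

def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def single_link(c1, c2):
    # Single-link distance: closest pair of points across the two clusters.
    return min(euclidean(p, q) for p in c1 for q in c2)

def agglomerative(points, k):
    """Repeatedly merge the two closest clusters until only k remain."""
    clusters = [[p] for p in points]   # start: each point is its own cluster
    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters.pop(j)  # merge cluster j into cluster i
    return clusters

# Assumed sample data (not from the slides).
pts = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
       (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
print(agglomerative(pts, 2))
```

Recording the merge order, instead of stopping at k clusters, is exactly what a dendrogram visualizes.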
10. How to Define Inter-Cluster Similarity
Similarity?
• Single Link: the distance between the closest pair of points, one from each cluster
• Complete Link: the distance between the farthest pair of points, one from each cluster
• Average Link: the mean of all pairwise distances between points in the two clusters
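The three linkage criteria are easy to express directly. A minimal Python sketch, where the two small example clusters are assumed for illustration:

```python
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def single_link(c1, c2):
    # Closest pair across the two clusters.
    return min(euclidean(p, q) for p in c1 for q in c2)

def complete_link(c1, c2):
    # Farthest pair across the two clusters.
    return max(euclidean(p, q) for p in c1 for q in c2)

def average_link(c1, c2):
    # Mean of all pairwise distances across the two clusters.
    dists = [euclidean(p, q) for p in c1 for q in c2]
    return sum(dists) / len(dists)

# Assumed example clusters.
a = [(0.0, 0.0), (0.0, 1.0)]
b = [(3.0, 0.0), (4.0, 0.0)]
print(single_link(a, b), complete_link(a, b), average_link(a, b))
```

Single link tends to produce elongated, chain-like clusters, complete link compact ones, and average link a compromise between the two.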
14. Common Similarity Measures
• The distance measure determines how the similarity of two
elements is calculated, and it influences the shape of the
clusters.
Common measures include:
1. The Euclidean distance (also called the 2-norm distance), given by:
   d(p, q) = sqrt( (p1 - q1)^2 + ... + (pn - qn)^2 )
2. The Manhattan distance (also called the taxicab or 1-norm distance), given by:
   d(p, q) = |p1 - q1| + ... + |pn - qn|
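Both distance measures can be written as one-liners in Python; the two sample points are taken from the k-means example below:

```python
def euclidean(p, q):
    # 2-norm: square root of the sum of squared coordinate differences.
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def manhattan(p, q):
    # 1-norm: sum of absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean((1.0, 1.0), (5.0, 7.0)))  # sqrt(16 + 36) = sqrt(52) ≈ 7.21
print(manhattan((1.0, 1.0), (5.0, 7.0)))  # 4 + 6 = 10
```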
15. A simple example showing the implementation of the
k-means algorithm
(using K=2)
16. Step 1:
Initialization: we randomly choose the following two centroids
(k=2) for the two clusters.
In this case the two centroids are m1=(1.0,1.0) and
m2=(5.0,7.0).
17. Step 2:
• Computing each object's distance to the two centroids, we
obtain two clusters:
{1,2,3} and {4,5,6,7}.
• Each cluster's new centroid is the mean of its members.
18. Step 3:
• Now, using these centroids, we
compute the Euclidean
distance of each object, as
shown in the table.
• Therefore, the new clusters
are:
{1,2} and {3,4,5,6,7}
• Next centroids are:
m1=(1.25,1.5) and m2 =
(3.9,5.1)
19. Step 4:
• The clusters obtained are:
{1,2} and {3,4,5,6,7}
• Since there is no change in
the cluster assignments, the
algorithm halts; the final result
consists of two clusters, {1,2}
and {3,4,5,6,7}.
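The four steps above can be sketched as a complete k-means run in Python. The slides' data table did not survive extraction, so the seven points below are an assumption; they do, however, reproduce the initial centroids m1=(1.0,1.0), m2=(5.0,7.0) and the final centroids m1=(1.25,1.5), m2=(3.9,5.1) stated above:

```python
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def mean(cluster):
    # Coordinate-wise mean of the points in a cluster.
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def kmeans(points, centroids):
    while True:
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)), key=lambda i: euclidean(p, centroids[i]))
            clusters[i].append(p)
        # Update step: recompute centroids; stop when they no longer move.
        new_centroids = [mean(c) for c in clusters]
        if new_centroids == centroids:
            return clusters, centroids
        centroids = new_centroids

# Assumed sample points (the slides' table was lost in extraction).
pts = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
       (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
clusters, centroids = kmeans(pts, [(1.0, 1.0), (5.0, 7.0)])
print(clusters)    # objects {1,2} and {3,4,5,6,7}
print(centroids)   # converges to (1.25, 1.5) and (3.9, 5.1)
```

Note that k-means is sensitive to the initial centroids: a different random initialization can converge to a different partition.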