3/22/2012
K-means Algorithm
Cluster Analysis in Data Mining
Presented by Zijun Zhang
Algorithm Description
 What is Cluster Analysis?
Cluster analysis groups data objects based only on
information found in data that describes the objects and their
relationships.
Goal of Cluster Analysis
The objects within a group should be similar to one another and
different from the objects in other groups
Algorithm Description
 Types of Clustering
Partitioning and Hierarchical Clustering
 Hierarchical Clustering
- A set of nested clusters organized as a hierarchical tree
 Partitioning Clustering
- A division of data objects into non-overlapping subsets
(clusters) such that each data object is in exactly one subset
Algorithm Description
[Figure: four points p1, p2, p3, p4 shown as a partitional clustering and as a hierarchical clustering]
Algorithm Description
 What is K-means?
1. Partitional clustering approach
2. Each cluster is associated with a centroid (center point)
3. Each point is assigned to the cluster with the closest centroid
4. Number of clusters, K, must be specified
Algorithm Statement
 Basic Algorithm of K-means
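The basic procedure can be sketched as: select K points as initial centroids, then repeatedly assign every point to its closest centroid and recompute the centroids, until the centroids stop changing. The Python below is a minimal sketch under stated assumptions, not the presenter's implementation: it assumes Euclidean distance and initial centroids drawn at random from the data, and the name `kmeans` and its parameters are illustrative.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Basic K-means on a list of equal-length tuples (illustrative sketch)."""
    rng = random.Random(seed)
    # 1. Select K points as the initial centroids (assumption: random data points).
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # 2. Form K clusters by assigning each point to its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # 3. Recompute each centroid as the mean of the points in its cluster.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coords) / len(members) for coords in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[i])
        # 4. Stop when the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical toy data: two well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, clusters = kmeans(points, k=2)
print(sorted(centroids))
```

Because the initial centroids are random, different seeds can give different results on harder data; the toy data here is separated clearly enough that the two groups are recovered.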
Algorithm Statement
 Details of K-means
1. Initial centroids are often chosen randomly.
- Clusters produced vary from one run to another
2. The centroid is (typically) the mean of the points in the cluster.
3. ‘Closeness’ is measured by Euclidean distance, cosine similarity, correlation,
etc.
4. K-means will converge for the common similarity measures mentioned above.
5. Most of the convergence happens in the first few iterations.
- Often the stopping condition is changed to ‘Until relatively few points
change clusters’
Algorithm Statement
 Euclidean Distance
A simple example: find the distance between two points, the origin (0,0)
and the point (3,4)
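For two n-dimensional points p and q, the Euclidean distance is the square root of the sum of squared coordinate differences, so for the origin and (3,4) it is sqrt(3^2 + 4^2) = 5. A small Python check of that arithmetic (the helper name `euclidean` is illustrative):

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


d = euclidean((0, 0), (3, 4))
print(d)  # 5.0
```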
Algorithm Statement
 Update Centroid
The centroid of k n-dimensional points is the n-dimensional point whose
i-th coordinate is the mean of the i-th coordinates of the k points.
Example: find the centroid of three 2D points, (2,4), (5,2) and (8,9).
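The example can be worked coordinate by coordinate: x = (2 + 5 + 8) / 3 = 5 and y = (4 + 2 + 9) / 3 = 5. A short Python sketch of the same computation (the function name `centroid` is illustrative):

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    k = len(points)
    return tuple(sum(coords) / k for coords in zip(*points))


c = centroid([(2, 4), (5, 2), (8, 9)])
print(c)  # (5.0, 5.0)
```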
Example of K-means
 Select three initial centroids
[Figure: scatter plot of the data (x vs. y), Iteration 1]
Example of K-means
 Assign each point to the nearest of the K centroids and recompute the
centroids
[Figure: scatter plot of the data (x vs. y), Iteration 3]
Example of K-means
 K-means terminates since the centroids converge to certain points
and do not change.
[Figure: scatter plot of the data (x vs. y), Iteration 6]
Example of K-means
[Figure: six scatter plots (x vs. y) showing the clustering at Iterations 1 through 6]
Example of K-means
 Demo of K-means
Evaluating K-means Clusters
 Most common measure is Sum of Squared Error (SSE)
 For each point, the error is the distance to the nearest cluster centroid
 To get SSE, we square these errors and sum them.
$$ SSE = \sum_{i=1}^{K} \sum_{x \in C_i} \mathrm{dist}(m_i, x)^2 $$
 x is a data point in cluster C_i and m_i is the representative point for cluster C_i
 One can show that m_i corresponds to the center (mean) of the cluster
 Given two clusterings, we can choose the one with the smallest error
 One easy way to reduce SSE is to increase K, the number of clusters
 A good clustering with smaller K can have a lower SSE than a poor
clustering with higher K
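SSE is the sum, over all clusters, of the squared distance from each point to its cluster's representative point. The Python below is a minimal sketch of that sum, assuming Euclidean distance for dist; the function name `sse` and the toy clusters are illustrative.

```python
import math


def sse(clusters, centroids):
    """Sum over clusters of squared distances from each point to its centroid."""
    return sum(
        math.dist(m, x) ** 2
        for members, m in zip(clusters, centroids)
        for x in members
    )


# Hypothetical toy clustering: every point lies at distance 1 from its centroid.
clusters = [[(1.0, 1.0), (3.0, 1.0)], [(8.0, 8.0), (8.0, 10.0)]]
centroids = [(2.0, 1.0), (8.0, 9.0)]
total = sse(clusters, centroids)
print(total)  # 4.0
```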
Problem about K
 How to choose K?
1. Use another clustering method, like EM.
2. Run algorithm on data with several different values of K.
3. Use the prior knowledge about the characteristics of the problem.
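One common way to apply option 2 is to run K-means for a range of K values and compare the resulting SSE, looking for the K beyond which the improvement levels off. The Python below is a sketch under stated assumptions: Euclidean distance, the first K data points as initial centroids (chosen only to keep the example deterministic), and an illustrative helper name `kmeans_sse`.

```python
import math


def kmeans_sse(points, k, max_iter=100):
    """Run a basic K-means (first k points as initial centroids) and return its SSE."""
    centroids = list(points[:k])
    for _ in range(max_iter):
        # Assign each point to its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(c) / len(m) for c in zip(*m)) if m else centroids[i]
            for i, m in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(
        math.dist(p, centroids[i]) ** 2
        for i, m in enumerate(clusters)
        for p in m
    )


# Hypothetical toy data: three well-separated groups of two points each.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0),
          (0.0, 1.0), (10.0, 1.0), (0.0, 11.0)]
sse_by_k = {k: kmeans_sse(points, k) for k in range(1, 6)}
for k, value in sse_by_k.items():
    print(k, round(value, 3))
```

On this toy data the SSE drops sharply up to K = 3 and only slightly afterwards, which is the kind of levelling-off one looks for when picking K this way.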
Problem of initializing centers
 How to initialize centers?
- Random Points in Feature Space
- Random Points From Data Set
- Look For Dense Regions of Space
- Space them uniformly around the feature space
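The first two options can be sketched in a few lines of Python. This is an illustrative sketch only: it interprets "feature space" as the bounding box of the data, and the function names `init_from_data` and `init_in_feature_space` are assumptions, not taken from the slides.

```python
import random


def init_from_data(points, k, rng):
    """Random points from the data set: pick k distinct data points."""
    return rng.sample(points, k)


def init_in_feature_space(points, k, rng):
    """Random points in feature space: draw k centers uniformly
    from the bounding box of the data (an assumed interpretation)."""
    lows = [min(coords) for coords in zip(*points)]
    highs = [max(coords) for coords in zip(*points)]
    return [
        tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
        for _ in range(k)
    ]


rng = random.Random(42)
points = [(0.0, 0.0), (1.0, 2.0), (4.0, 1.0), (5.0, 5.0)]
print(init_from_data(points, 2, rng))
print(init_in_feature_space(points, 2, rng))
```

Centers taken from the data always start with at least one point assigned to them, whereas centers drawn from the feature space can start far from every point and end up with an empty cluster.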
Cluster Quality
Cluster Quality
Limitation of K-means
 K-means has problems when clusters are of differing
 Sizes
 Densities
 Non-globular shapes
 K-means has problems when the data contains outliers.
Limitation of K-means
[Figure: Original Points compared with K-means (3 Clusters)]
Application of K-means
 Image Segmentation
The k-means clustering algorithm is commonly used in
computer vision as a form of image segmentation. The
results of the segmentation are used to aid border detection
and object recognition.
K-means in Wind Energy
 Clustering can be applied to detect abnormality in wind data (abnormal
vibration)
 Monitor Wind Turbine Conditions
 Beneficial to preventative maintenance
 K-means can be more powerful and applicable after appropriate modifications
K-means in Wind Energy
Modified K-means
K-means in Wind Energy
 Clustering cost function
$$ d(\mathbf{x}, \mathbf{c}, k) = \frac{1}{n} \sum_{i=1}^{k} \sum_{\mathbf{x}_j \in C_i} \lVert \mathbf{x}_j - \mathbf{c}_i \rVert^2, \qquad n = \sum_{i=1}^{k} m_i $$
$$ d(\mathbf{x}, \mathbf{c}, k) = \frac{1}{k} \sum_{i=1}^{k} \frac{1}{m_i} \sum_{\mathbf{x}_j \in C_i} \lVert \mathbf{x}_j - \mathbf{c}_i \rVert^2 $$
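A clustering cost of this kind can be computed directly from the cluster memberships. The Python below is a sketch under stated assumptions: it takes the cost to be the mean squared Euclidean distance of the n points to their cluster centers c_i, with n the sum of the cluster sizes m_i. The function name `clustering_cost` and the toy data are illustrative.

```python
import math


def clustering_cost(clusters, centers):
    """(1/n) * sum over clusters C_i and points x_j in C_i of ||x_j - c_i||^2."""
    # n = m_1 + ... + m_k, where m_i is the number of points in cluster i.
    n = sum(len(members) for members in clusters)
    total = sum(
        math.dist(x, c) ** 2
        for members, c in zip(clusters, centers)
        for x in members
    )
    return total / n


# Hypothetical toy clustering: squared distances are 1, 1, 1, 1 and 0.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(5.0, 5.0), (5.0, 7.0), (5.0, 6.0)]]
centers = [(1.0, 0.0), (5.0, 6.0)]
cost = clustering_cost(clusters, centers)
print(cost)  # 0.8
```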
K-means in Wind Energy
 Determination of k value
[Figure: cost of clustering (y-axis, 0 to 0.09) plotted against the number of clusters (x-axis, 2 to 13)]
K-means in Wind Energy
 Summary of clustering result
Cluster No.  c1 (Drive train acc.)  c2 (Wind speed)  Number of points  Percentage (%)
1            71.9612                9.97514          313               8.75524
2            65.8387                9.42031          295               8.25175
3            233.9184               9.57990          96                2.68531
4            17.4187                7.13375          240               6.71329
5            3.3706                 8.99211          437               12.22378
6            0.3741                 0.40378          217               6.06993
7            18.1361                8.09900          410               11.46853
8            0.7684                 10.56663         419               11.72028
9            62.0493                8.81445          283               7.91608
10           81.7522                10.67867         181               5.06294
11           83.8067                8.10663          101               2.82517
12           0.9283                 9.78571          583               16.30769
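The percentage column is each cluster's share of all clustered points. A short Python check, using only the point counts from the table, reproduces it:

```python
# Number of points per cluster, copied from the table (clusters 1 to 12).
counts = [313, 295, 96, 240, 437, 217, 410, 419, 283, 181, 101, 583]

total = sum(counts)
percentages = [round(100 * c / total, 5) for c in counts]

print(total)            # 3575
print(percentages[0])   # 8.75524
print(percentages[-1])  # 16.30769
```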
K-means in Wind Energy
 Visualization of monitoring result
K-means in Wind Energy
 Visualization of vibration under normal condition
[Figure: wind speed (m/s, y-axis, 0 to 14) plotted against drive train acceleration (x-axis, 0 to 140)]
Reference
1. Introduction to Data Mining, P.N. Tan, M. Steinbach, V. Kumar, Addison Wesley
2. An efficient k-means clustering algorithm: Analysis and implementation, T. Kanungo, D. M. Mount, N. Netanyahu, C. Piatko, R. Silverman, and A. Y. Wu, IEEE Trans. Pattern Analysis and Machine Intelligence, 24 (2002), 881-892
3. http://www.cs.cmu.edu/~cga/ai-course/kmeans.pdf
4. http://www.cse.msstate.edu/~url/teaching/CSE6633Fall08/lec16%20k-means.pdf
Appendix One
[Figure: Original Points compared with K-means (2 Clusters)]
Appendix Two
[Figure: Original Points compared with K-means Clusters]
One solution is to use many clusters.
K-means then finds parts of the natural clusters, which need to be put together afterwards.
