Cluster analysis using the k-means method
Vladimir Bakhrushin,
Professor, D.Sc. (Phys. & Math.)
Vladimir.Bakhrushin@gmail.com
Formulation of the problem
The task of cluster analysis is to divide a given set of
points into a certain number of groups (clusters) so that the sum
of squared distances from the points to their cluster centers is minimal.
At the minimum, each cluster center coincides with the centroid of
the points in the corresponding cell of the Voronoi diagram.
Main algorithms:
• Hartigan and Wong
• Lloyd
• Lloyd-Forgy
• MacQueen
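The quantity minimised in the problem formulation above can be written as a short R function. This is only an illustrative sketch: the name total_wss and the toy data are assumptions made for the example and do not come from the slides.

```r
# Total within-cluster sum of squared Euclidean distances.
# x:       numeric matrix, one point per row
# cluster: integer labels 1..k, one per point
# centers: k x ncol(x) matrix of cluster centers
# (total_wss is a hypothetical helper name used only in this sketch.)
total_wss <- function(x, cluster, centers) {
  sum((x - centers[cluster, , drop = FALSE])^2)
}

# Toy example: four points on a line split into two clusters.
x <- matrix(c(0, 1, 10, 11), ncol = 1)
cluster <- c(1, 1, 2, 2)
centers <- matrix(c(0.5, 10.5), ncol = 1)

objective <- total_wss(x, cluster, centers)
print(objective)
```

Moving either center away from the mean of its two points can only increase this value, which is why the centers end up at the cluster means.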
The initial approximation
The first step is to set the initial approximation of the cluster centers.
The most commonly used methods are:
• set the cluster centers directly;
• set the number of clusters k and take the coordinates of the first k
points as centers;
• set the number of clusters k and take the coordinates of k randomly
selected points as centers (in this case it is advisable to run the
algorithm several times from different random starts).
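The three ways of setting the initial approximation listed above might look as follows with R's kmeans(). This is a sketch under stated assumptions: the toy data, the seed, and the variable names are invented for illustration.

```r
set.seed(1)  # assumption: fixed seed, used only to make the sketch repeatable

# Hypothetical toy data: two well-separated groups of 20 points in the plane.
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 8), ncol = 2))
k <- 2

# 1. Centers given directly as a k x 2 matrix.
fit_direct <- kmeans(x, centers = rbind(c(0, 0), c(8, 8)))

# 2. The first k points taken as centers.
fit_first <- kmeans(x, centers = x[1:k, , drop = FALSE])

# 3. k randomly selected points as centers, repeated for several random
#    starts; kmeans() keeps the best of the nstart runs.
fit_random <- kmeans(x, centers = k, nstart = 10)

print(c(direct = fit_direct$tot.withinss,
        first  = fit_first$tot.withinss,
        random = fit_random$tot.withinss))
```

Comparing the total within-cluster sums of squares is a simple way to see whether different starting points led to the same partition.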
Iteration procedure
1. Assign each point to the cluster whose center is nearest to it.
The squared Euclidean distance is most commonly used as the measure of
closeness, but other distance measures may also be chosen.
2. Recalculate the coordinates of the cluster centers. If the measure
of closeness is the Euclidean distance (or its square), each cluster
center is the arithmetic mean of the coordinates of the points that
belong to that cluster.
The iterations stop when the specified maximum number of iterations
has been reached or when the composition of the clusters no longer
changes.
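The two steps and the stopping rule above can be sketched as a plain Lloyd-style loop. This is not the code behind R's kmeans(): the function name simple_kmeans is hypothetical, and the sketch assumes squared Euclidean distance and does not handle clusters that become empty.

```r
# Minimal Lloyd-style iteration (illustrative sketch only).
simple_kmeans <- function(x, centers, iter.max = 10) {
  cluster <- integer(nrow(x))
  for (iter in seq_len(iter.max)) {
    # Step 1: squared Euclidean distance from every point to every center,
    # then assign each point to its nearest center.
    d2 <- sapply(seq_len(nrow(centers)), function(j) {
      rowSums((x - matrix(centers[j, ], nrow(x), ncol(x), byrow = TRUE))^2)
    })
    new_cluster <- max.col(-d2, ties.method = "first")
    # Stop when the composition of the clusters no longer changes.
    if (identical(new_cluster, cluster)) break
    cluster <- new_cluster
    # Step 2: each center becomes the mean of the points assigned to it.
    for (j in seq_len(nrow(centers))) {
      centers[j, ] <- colMeans(x[cluster == j, , drop = FALSE])
    }
  }
  list(cluster = cluster, centers = centers)
}

# Usage on four points, starting from the first and third point as centers.
x <- rbind(c(0, 0), c(0, 1), c(10, 10), c(10, 11))
fit <- simple_kmeans(x, centers = x[c(1, 3), , drop = FALSE])
print(fit$cluster)
print(fit$centers)
```

The loop also ends once iter.max passes have been made, which mirrors the second stopping condition in the text.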
Limitations and shortcomings
Limitation (shortcoming) | Remedy
Setting the number of clusters (initial approximation) | Preliminary analysis of the data
Sensitivity to outliers | Using k-medians
Slow work on large arrays | Using random samples from the arrays
Forming the data array
# Four groups of 20 points each; column 1 is x, column 2 is y, sd = 1 in both
a1 <- matrix(c(rnorm(20, mean = 5, sd = 1), rnorm(20, mean = 5, sd = 1)), nrow = 20, ncol = 2)
a2 <- matrix(c(rnorm(20, mean = 5, sd = 1), rnorm(20, mean = 13, sd = 1)), nrow = 20, ncol = 2)
a3 <- matrix(c(rnorm(20, mean = 12, sd = 1), rnorm(20, mean = 6, sd = 1)), nrow = 20, ncol = 2)
a4 <- matrix(c(rnorm(20, mean = 12, sd = 1), rnorm(20, mean = 12, sd = 1)), nrow = 20, ncol = 2)
a <- rbind(a1, a2, a3, a4)
The function rbind() forms the matrix a, in which the first 20 rows are the
rows of matrix a1, the next 20 are the rows of matrix a2, and so on.
Group centers
Next, we calculate the matrix of the centers of the generated groups
and display the result on the screen.
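The code for this step is not preserved in the text of the slides. One possible way to compute the group centers is sketched below; the helper make_group, the name centers.a, and the seed are assumptions made for the example, and the column names xa and ya are borrowed from the comparison table.

```r
set.seed(42)  # assumption: the slides do not fix a seed

# Hypothetical helper: n points with normal x and y coordinates, sd = 1.
make_group <- function(mx, my, n = 20) {
  cbind(rnorm(n, mean = mx, sd = 1), rnorm(n, mean = my, sd = 1))
}
a1 <- make_group(5, 5)
a2 <- make_group(5, 13)
a3 <- make_group(12, 6)
a4 <- make_group(12, 12)

# One row per group: mean x and mean y of the points generated for it.
centers.a <- rbind(a1 = colMeans(a1), a2 = colMeans(a2),
                   a3 = colMeans(a3), a4 = colMeans(a4))
colnames(centers.a) <- c("xa", "ya")
print(centers.a)
```

Because each group has only 20 points, the sample centers differ slightly from the means passed to rnorm().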
Function kmeans()
To form clusters by the k-means method we can use the function:
kmeans(x, centers, iter.max = 10, nstart = 1,
       algorithm = c("Hartigan-Wong", "Lloyd", "Forgy", "MacQueen"))
x – matrix of numerical data;
centers – initial approximation of the cluster centers, or the number of
clusters (in the latter case, that number of randomly selected rows of
the matrix x is taken as the initial approximation);
iter.max – maximum number of iterations;
nstart – number of random initial sets to try when centers is
the number of clusters;
algorithm – choice of clustering algorithm.
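A call using the arguments described above might look like the following sketch. The data are regenerated with the same means as in the slides, but the seed, the helper make_group, and the choice nstart = 10 are assumptions of this example.

```r
set.seed(42)  # assumption: fixed seed for repeatability

# Hypothetical helper reproducing the four groups from the data-array slide.
make_group <- function(mx, my, n = 20) {
  cbind(rnorm(n, mean = mx, sd = 1), rnorm(n, mean = my, sd = 1))
}
a <- rbind(make_group(5, 5), make_group(5, 13),
           make_group(12, 6), make_group(12, 12))

# Ask for 4 clusters, try 10 random initial sets, keep the best result.
cl <- kmeans(a, centers = 4, iter.max = 10, nstart = 10,
             algorithm = "Hartigan-Wong")

print(cl$centers)  # coordinates of the cluster centers found
print(cl$size)     # number of points in each cluster

# Cross-tabulate the generating group against the assigned cluster.
print(table(group = rep(1:4, each = 20), cluster = cl$cluster))
```

The cluster numbers returned by kmeans() are arbitrary, so cluster 1 need not correspond to group a1.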
Clustering results
Comparison of centers
Group (cluster) number | xa | ya | xcl | ycl
a1 | 4.613619 | 5.169488 | 4.613619 | 5.169488
a2 | 4.570456 | 13.396202 | 4.570456 | 13.396202
a3 | 11.855793 | 5.936099 | 11.855793 | 5.936099
a4 | 12.197688 | 11.930728 | 12.197688 | 11.930728
b1 | 5.531175 | 5.405187 | 5.545309 | 5.527677
b2 | 5.340795 | 12.983168 | 5.472965 | 13.239925
b3 | 11.770917 | 6.725708 | 11.842934 | 6.916365
Residuals
Using the command sd(resid.a) we can calculate the standard deviations
of the residuals. They are close to the standard deviations specified
for the initial arrays, which confirms the adequacy of the clustering
results.
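The slides do not show how resid.a was obtained. One plausible construction, sketched here as an assumption, subtracts from each point the center of the cluster it was assigned to, using fitted() on the kmeans result. The seed and the helper make_group are also assumptions of this example.

```r
set.seed(42)  # assumption: fixed seed for repeatability

# Hypothetical helper reproducing the four groups from the data-array slide.
make_group <- function(mx, my, n = 20) {
  cbind(rnorm(n, mean = mx, sd = 1), rnorm(n, mean = my, sd = 1))
}
a <- rbind(make_group(5, 5), make_group(5, 13),
           make_group(12, 6), make_group(12, 12))
cl <- kmeans(a, centers = 4, nstart = 10)

# fitted() returns, for every point, the center of its cluster,
# so the difference is the residual of each coordinate.
resid.a <- a - fitted(cl)

print(sd(resid.a))            # one value over all coordinates
print(apply(resid.a, 2, sd))  # separately for x and y
```

With data generated with sd = 1, values near 1 suggest that the clusters match the generating groups.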
Results of division into 3 clusters
Results of division into 5 clusters
Within-group and between-group variation
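The figures for these slides are not preserved in the text. As a rough sketch of how within-group and between-group variation could be compared for 3, 4 and 5 clusters, the components tot.withinss, betweenss and totss of the kmeans result can be tabulated. The seed, the helper make_group, and the name variation are assumptions of this example.

```r
set.seed(42)  # assumption: fixed seed for repeatability

# Hypothetical helper reproducing the four groups from the data-array slide.
make_group <- function(mx, my, n = 20) {
  cbind(rnorm(n, mean = mx, sd = 1), rnorm(n, mean = my, sd = 1))
}
a <- rbind(make_group(5, 5), make_group(5, 13),
           make_group(12, 6), make_group(12, 12))

# One row per number of clusters: within-group, between-group and total
# sums of squares reported by kmeans().
variation <- t(sapply(3:5, function(k) {
  fit <- kmeans(a, centers = k, nstart = 10)
  c(k = k, within = fit$tot.withinss,
    between = fit$betweenss, total = fit$totss)
}))
print(variation)
```

The within and between columns always add up to the total, so a larger between share means a better-separated partition for that k.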
