K-Means Clustering Problem
            Ahmad Sabiq
          Febri Maspiyanti
       Indah Kuntum Khairina
          Wiwin Farhania
              Yonatan
What is k-means?
• A method for partitioning n objects into k clusters based on
  their attributes.
  – Objects in the same cluster are close together: their
    attributes are similar.
  – Objects in different clusters are far apart: their
    attributes are very dissimilar.
Algorithm
• Input: n objects, k (integer k ≤ n)
• Output: k clusters
• Steps:
   1. Select k initial centroids.
   2. Calculate the distance between each object and
      each centroid.
   3. Assign each object to the cluster with the nearest
      centroid.
   4. Recalculate each centroid.
   5. If the centroids don’t change, stop (convergence).
      Otherwise, go back to step 2.
• Complexity: O(k · n · d · total_iterations), where d is the
  number of attributes per object
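The five steps above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function names (`nearest`, `mean`, `kmeans`), the `max_iter` safety cap, the use of Euclidean distance, and the rule that an empty cluster keeps its old centroid are all assumptions made for the example.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(objects, centroids, max_iter=100):
    """Steps 2-5: alternate assignment and update until the centroids stop changing."""
    centroids = [tuple(c) for c in centroids]  # step 1 is done by the caller
    labels = []
    for iteration in range(1, max_iter + 1):
        # Steps 2-3: assign every object to its nearest centroid.
        labels = [nearest(p, centroids) for p in objects]
        # Step 4: recompute each centroid (assumption: an empty cluster keeps its old one).
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(objects, labels) if lab == j]
            updated.append(mean(members) if members else old)
        # Step 5: stop when no centroid moved.
        if updated == centroids:
            return centroids, labels, iteration
        centroids = updated
    return centroids, labels, max_iter


# Hypothetical toy data: two well-separated groups of three points.
objects = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, labels, iterations = kmeans(objects, [(1.0, 1.0), (8.0, 8.0)])
print(labels, iterations)
```

Each pass computes n × k distances, each costing d operations, which is where the per-iteration factor k · n · d in the complexity comes from.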
Initialization
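• Why is it important? What does it affect?
  – The clustering result: k-means only reaches a local optimum,
    and which one depends on the initial centroids.
  – The total number of iterations, and hence the running time.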
Good Initialization
3 clusters with 2 iterations…
Bad Initialization
3 clusters with 4 iterations…
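A small sketch of why the starting centroids matter. The four points and both sets of starting centroids are made up for illustration (they are not the data from the slides' figures), and `lloyd` is a hypothetical helper name. Both runs converge, but to different clusterings with different sums of squared errors (SSE).

```python
import math


def lloyd(objects, centroids, max_iter=100):
    """Plain k-means loop; returns the final centroids and their SSE."""
    for _ in range(max_iter):
        # Assignment step: group objects by nearest centroid.
        clusters = [[] for _ in centroids]
        for p in objects:
            j = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step (assumption: an empty cluster keeps its old centroid).
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    # SSE: squared distance from each object to its nearest final centroid.
    sse = sum(min(math.dist(p, c) for c in centroids) ** 2 for p in objects)
    return centroids, sse


# Corners of a wide rectangle, to be split into k = 2 clusters.
objects = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

good_centroids, good_sse = lloyd(objects, [(0.0, 0.5), (4.0, 0.5)])  # left / right start
bad_centroids, bad_sse = lloyd(objects, [(2.0, 0.0), (2.0, 1.0)])    # bottom / top start
print(good_sse, bad_sse)
```

With this toy data the left/right start ends at SSE 1.0, while the bottom/top start is already stable and stays at SSE 16.0: k-means stops at a local optimum and never recovers the better split.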
Initialization Methods
1.   Random
2.   Forgy
3.   MacQueen
4.   Kaufman
Random
• Algorithm:
  1. Assign each object to a random cluster.
  2. Compute the initial centroid of each cluster.
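A minimal sketch of the Random method in plain Python. The function name `random_partition_init`, the explicit `rng` argument, and the fallback for a cluster that receives no objects are assumptions for the example; the slides do not say how an empty cluster is handled.

```python
import random


def random_partition_init(objects, k, rng):
    """Assign each object to a random cluster, then average each cluster."""
    # Step 1: one random cluster label per object.
    labels = [rng.randrange(k) for _ in objects]
    centroids = []
    for j in range(k):
        members = [p for p, lab in zip(objects, labels) if lab == j]
        if not members:
            # Assumption: an empty cluster falls back to one randomly chosen object.
            members = [rng.choice(objects)]
        # Step 2: the centroid is the component-wise mean of the cluster's members.
        centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return centroids


# Hypothetical toy data.
objects = [(1.0, 1.0), (2.0, 3.0), (5.0, 4.0), (9.0, 8.0), (12.0, 2.0), (20.0, 6.0),
           (25.0, 7.0), (30.0, 1.0), (33.0, 5.0)]
centroids = random_partition_init(objects, 3, random.Random(0))
print(centroids)
```

Because every cluster draws from the whole data set, the resulting centroids tend to lie near the overall mean of the data.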
[Figures: Random initialization on an example scatter plot; horizontal axis 0 to 35, vertical axis 0 to 9]
Forgy
• Algorithm:
  1. Choose k objects at random and use them as the initial
     centroids.
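A minimal sketch of the Forgy method. The function name `forgy_init` and the explicit `rng` argument are assumptions for the example; `random.Random.sample` draws without replacement, so the k chosen objects are distinct positions in the list.

```python
import random


def forgy_init(objects, k, rng):
    """Pick k objects at random (without replacement) as the initial centroids."""
    return [tuple(p) for p in rng.sample(objects, k)]


# Hypothetical toy data.
objects = [(1.0, 1.0), (2.0, 3.0), (5.0, 4.0), (9.0, 8.0), (12.0, 2.0), (20.0, 6.0),
           (25.0, 7.0), (30.0, 1.0), (33.0, 5.0)]
centroids = forgy_init(objects, 3, random.Random(0))
print(centroids)
```

Unlike the Random method, every initial centroid here is an actual data point.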
[Figure: Forgy initialization on an example scatter plot; horizontal axis 0 to 35, vertical axis 0 to 9]
MacQueen
• Algorithm:
  1. Choose k objects at random and use them as the initial
     centroids.
  2. Assign each remaining object to the cluster with the nearest
     centroid.
  3. After each assignment, recalculate the centroid of the
     cluster that received the object.
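A minimal sketch of the MacQueen method. The function name `macqueen_init`, the explicit `rng` argument, visiting the objects in list order, and updating the centroid with a running mean are assumptions for the example.

```python
import math
import random


def macqueen_init(objects, k, rng):
    """k random seeds, then one pass over the data with incremental centroid updates."""
    # Step 1: k distinct objects become the initial centroids.
    seed_idx = rng.sample(range(len(objects)), k)
    centroids = [list(objects[i]) for i in seed_idx]
    counts = [1] * k  # each seed is the first member of its own cluster
    for i, p in enumerate(objects):
        if i in seed_idx:
            continue
        # Step 2: assign the object to the cluster with the nearest centroid.
        j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
        counts[j] += 1
        # Step 3: running mean, equivalent to re-averaging all members of cluster j.
        centroids[j] = [m + (x - m) / counts[j] for m, x in zip(centroids[j], p)]
    return [tuple(c) for c in centroids]


# Hypothetical toy data.
objects = [(1.0, 1.0), (2.0, 3.0), (5.0, 4.0), (9.0, 8.0), (12.0, 2.0), (20.0, 6.0),
           (25.0, 7.0), (30.0, 1.0), (33.0, 5.0)]
centroids = macqueen_init(objects, 3, random.Random(0))
print(centroids)
```

Because the centroids move after every assignment, the result depends on the order in which the objects are visited.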
[Figures: MacQueen initialization shown step by step on an example scatter plot; horizontal axis 0 to 35, vertical axis 0 to 9]
Kaufman
[Figures: Kaufman initialization shown step by step on the example data. Values annotated on the slides:]
• First comparison: C = 0, d = 24.33, D = 15.52
• Following slides: C = 0 annotated on five objects, then ∑C1 = 2.74
• Sums per candidate:
  ∑C1 = 2.74    ∑C2 = 12.21   ∑C3 = 12.36   ∑C3 = 8.38
  ∑C5 = 52.55   ∑C6 = 55.88   ∑C7 = 53.77   ∑C8 = 51.16   ∑C9 = 42.69
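The slides show the Kaufman method only through annotated figures (d, D, C and ∑C). The sketch below follows the method as it is commonly described after Kaufman and Rousseeuw: the first seed is the most centrally located object, and each further seed is the object i that maximizes ∑ C_ji, with C_ji = max(D_j − d_ji, 0), where d_ji is the distance between objects j and i and D_j is the distance from j to its nearest seed so far. The function name `kaufman_init`, the Euclidean distance, and leaving j = i out of the sum are assumptions for the example.

```python
import math


def kaufman_init(objects, k):
    """Greedy seed selection: central object first, then repeatedly the largest sum of C."""
    n = len(objects)
    d = [[math.dist(a, b) for b in objects] for a in objects]
    # First seed: the object with the smallest total distance to all others.
    seeds = [min(range(n), key=lambda i: sum(d[i]))]
    while len(seeds) < k:
        # D[j]: distance from object j to its nearest already-selected seed.
        D = [min(d[j][s] for s in seeds) for j in range(n)]

        def gain(i):
            # Sum of C_ji = max(D_j - d_ji, 0) over the other non-selected objects j
            # (assumption: j = i itself is not counted).
            return sum(max(D[j] - d[j][i], 0.0)
                       for j in range(n) if j != i and j not in seeds)

        candidates = [i for i in range(n) if i not in seeds]
        seeds.append(max(candidates, key=gain))
    return [objects[i] for i in seeds]


# Hypothetical toy data: two corner groups and one central point.
objects = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0),
           (10.0, 11.0), (5.0, 5.0)]
seeds = kaufman_init(objects, 3)
print(seeds)
```

An object j contributes C = 0 whenever it is already closer to an existing seed than to the candidate (d ≥ D), which matches the C = 0 annotations next to d = 24.33 and D = 15.52 on the slides.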
References
1. J.M. Peña, J.A. Lozano, and P. Larrañaga. An Empirical
   Comparison of Four Initialization Methods for the
   K-Means Algorithm. Pattern Recognition Letters, vol. 20,
   pp. 1027–1040. 1999.
2. J.R. Cano, O. Cordón, F. Herrera, and L. Sánchez. A
   Greedy Randomized Adaptive Search Procedure
   Applied to the Clustering Problem as an Initialization
   Process Using K-Means as a Local Search Procedure.
   Journal of Intelligent and Fuzzy Systems, vol. 12,
   pp. 235–242. 2002.
3. L. Kaufman and P.J. Rousseeuw. Finding Groups in
   Data: An Introduction to Cluster Analysis. Wiley. 1990.
Questions
1. Why is initialization important in k-means?
2. Which initialization method has the greedy-choice
   property?
3. Explain the O(nkd) complexity of the Random
   method.
