SlideShare a Scribd company logo
1 of 61
Download to read offline
R EADING SEMINAR ON CLASSICS

              PRESENTED BY
                       ´
               B EJI C E LINE

                 Article

AS 136: A K-Means Clustering Algorithm



           Suggested by C. Robert
Plan

1   I NTRODUCTION

2   AGORITHM

3   D EMONSTRATION

4   C ONVERGENCE AND T IME

5   C ONSIDERABLE PROGRESS

6   T HE LIMITS OF THE ALGORITHM

7   C ONCLUSION
Plan

1   I NTRODUCTION

2   AGORITHM

3   D EMONSTRATION

4   C ONVERGENCE AND T IME

5   C ONSIDERABLE PROGRESS

6   T HE LIMITS OF THE ALGORITHM

7   C ONCLUSION
Plan

1   I NTRODUCTION

2   AGORITHM

3   D EMONSTRATION

4   C ONVERGENCE AND T IME

5   C ONSIDERABLE PROGRESS

6   T HE LIMITS OF THE ALGORITHM

7   C ONCLUSION
Plan

1   I NTRODUCTION

2   AGORITHM

3   D EMONSTRATION

4   C ONVERGENCE AND T IME

5   C ONSIDERABLE PROGRESS

6   T HE LIMITS OF THE ALGORITHM

7   C ONCLUSION
Plan

1   I NTRODUCTION

2   AGORITHM

3   D EMONSTRATION

4   C ONVERGENCE AND T IME

5   C ONSIDERABLE PROGRESS

6   T HE LIMITS OF THE ALGORITHM

7   C ONCLUSION
Plan

1   I NTRODUCTION

2   AGORITHM

3   D EMONSTRATION

4   C ONVERGENCE AND T IME

5   C ONSIDERABLE PROGRESS

6   T HE LIMITS OF THE ALGORITHM

7   C ONCLUSION
Plan

1   I NTRODUCTION

2   AGORITHM

3   D EMONSTRATION

4   C ONVERGENCE AND T IME

5   C ONSIDERABLE PROGRESS

6   T HE LIMITS OF THE ALGORITHM

7   C ONCLUSION
I NTRODUCTION




    1   I NTRODUCTION

    2   AGORITHM

    3   D EMONSTRATION

    4   C ONVERGENCE AND T IME

    5   C ONSIDERABLE PROGRESS

    6   T HE LIMITS OF THE ALGORITHM

    7   C ONCLUSION
I NTRODUCTION




P RESENTATION OF THE ARTICLE




    AS 136: A K-Means Clustering Algorithm
           Authors: J. A Hartigan and M. A. Wong
           Source: Journal of the Roal Statistical Society. Series C
           (Applied Statistics), Vol 28, No.1 (1979), pp.100-108
           Implemented in FORTRAN
I NTRODUCTION




C LUSTERING
    Clustering is the classical problem of dividing a data sample in
    some space into a collection of disjoint groups.
I NTRODUCTION




T HE AIM OF THE ALGORITHM




      Divide M points in N dimensions into K clusters
        so that the within-cluster sum of squares is
                         minimized.
I NTRODUCTION




E XAMPLE OF APPLICATION



    The algorithm is apply even on large data sets.

    For example ranging from:
           Market segmentation
           Image processing
           Geostatistics
           ...
I NTRODUCTION




E XAMPLE OF APPLICATION
    M ARKET SEGMENTATION
    It is necessary to segment the Customer in order to know more
    precisely the needs and expectations of each group.
           M: number of people
           N: number of criterion (age, sex, social status, etc ...)
           k: number of cluster




                           F IGURE : market segmentation
AGORITHM




   1   I NTRODUCTION

   2   AGORITHM

   3   D EMONSTRATION

   4   C ONVERGENCE AND T IME

   5   C ONSIDERABLE PROGRESS

   6   T HE LIMITS OF THE ALGORITHM

   7   C ONCLUSION
AGORITHM




           F IGURE : General Schema
AGORITHM




           F IGURE : General Schema
AGORITHM




I NITIALIZATION



      1     For each point point (I=1,2,...,M) find its clostest and
            second clostest cluster centers, keep them in IC1(I) and
            IC2(I), respectively.

      2     Update the cluster centres to be the average of the points
            contained within them.

      3     Put all cluster in the live set.
AGORITHM




I NITIALIZATION



      1     For each point point (I=1,2,...,M) find its clostest and
            second clostest cluster centers, keep them in IC1(I) and
            IC2(I), respectively.

      2     Update the cluster centres to be the average of the points
            contained within them.

      3     Put all cluster in the live set.
AGORITHM




I NITIALIZATION



      1     For each point point (I=1,2,...,M) find its clostest and
            second clostest cluster centers, keep them in IC1(I) and
            IC2(I), respectively.

      2     Update the cluster centres to be the average of the points
            contained within them.

      3     Put all cluster in the live set.
AGORITHM




           F IGURE : General Schema
AGORITHM




T HE LIVE SET
    We consider each point (I=1,2,...,M), let point I be in cluster L1.

    L1 is in the live set: If the cluster is updated in the quick-transfer
    (QTRAN) stage.

    L1 is not in the live set : otherwise, and if it has not been updated
    in the last M optimal-transfer (OPTRA) steps.
AGORITHM




T HE LIVE SET
    We consider each point (I=1,2,...,M), let point I be in cluster L1.

    L1 is in the live set: If the cluster is updated in the quick-transfer
    (QTRAN) stage.

    L1 is not in the live set : otherwise, and if it has not been updated
    in the last M optimal-transfer (OPTRA) steps.
AGORITHM




T HE LIVE SET
    We consider each point (I=1,2,...,M), let point I be in cluster L1.

    L1 is in the live set: If the cluster is updated in the quick-transfer
    (QTRAN) stage.

    L1 is not in the live set : otherwise, and if it has not been updated
    in the last M optimal-transfer (OPTRA) steps.
AGORITHM




           F IGURE : General Schema
AGORITHM




O PTIMAL - TRANSFER STAGE (OPTRA)


    Compute the minimum of the quantity for all cluster L (L=L1) ,


                               NC(L) ∗ D(I, L)2
                        R2 =                                       (1)
                                 NC(L) + 1

    NC(L): The number of points in cluster L
    D(I,L): The Euclidean distance between point I and cluster L


    Let L2 be the cluster with the smallest R2
AGORITHM




O PTIMAL - TRANSFER STAGE (OPTRA)
    If:
                     NC(L1) ∗ D(I, L1)2   NC(L2) ∗ D(I, L2)2
                                        ≥                                  (2)
                       NC(L1) + 1           NC(L2) + 1

            No reallocation
            L2 is the new IC2(I)

    If:
                     NC(L1) ∗ D(I, L1)2   NC(L2) ∗ D(I, L2)2
                                        <                                  (3)
                       NC(L1) + 1           NC(L2) + 1

            I is allocated to cluster L2
            L1 is the new IC2(I)
            Cluster centres are updated to be the means of points
            assigned to them
            The two clusters that are involved in the transfer of point I at
            this particular step are now in the live set
AGORITHM




O PTIMAL - TRANSFER STAGE (OPTRA)
    If:
                     NC(L1) ∗ D(I, L1)2   NC(L2) ∗ D(I, L2)2
                                        ≥                                  (2)
                       NC(L1) + 1           NC(L2) + 1

            No reallocation
            L2 is the new IC2(I)

    If:
                     NC(L1) ∗ D(I, L1)2   NC(L2) ∗ D(I, L2)2
                                        <                                  (3)
                       NC(L1) + 1           NC(L2) + 1

            I is allocated to cluster L2
            L1 is the new IC2(I)
            Cluster centres are updated to be the means of points
            assigned to them
            The two clusters that are involved in the transfer of point I at
            this particular step are now in the live set
AGORITHM




           F IGURE : General Schema
AGORITHM




           F IGURE : General Schema
AGORITHM




           F IGURE : General Schema
AGORITHM




           F IGURE : General Schema
AGORITHM




T HE QUICK - TRANSFER STAGE (QTRAN)
    We consider each point (I=1,2,...,M) in turn
    Let L1=IC1(I) and L2=IC2(I)

    If
                    NC(L1) ∗ D(I, L1)2   NC(L2) ∗ D(I, L2)2
                                       <                            (4)
                      NC(L1) + 1           NC(L2) + 1

            Then, point I remains in cluster L1

    Otherwise,
            IC1 ↔ IC2
            Update the centres of cluster L1 and L2
            Its noted that transfer took place

    This step is repeated until transfer take place in the last M
    steps.
AGORITHM




T HE QUICK - TRANSFER STAGE (QTRAN)
    We consider each point (I=1,2,...,M) in turn
    Let L1=IC1(I) and L2=IC2(I)

    If
                    NC(L1) ∗ D(I, L1)2   NC(L2) ∗ D(I, L2)2
                                       <                            (4)
                      NC(L1) + 1           NC(L2) + 1

            Then, point I remains in cluster L1

    Otherwise,
            IC1 ↔ IC2
            Update the centres of cluster L1 and L2
            Its noted that transfer took place

    This step is repeated until transfer take place in the last M
    steps.
AGORITHM




T HE QUICK - TRANSFER STAGE (QTRAN)
    We consider each point (I=1,2,...,M) in turn
    Let L1=IC1(I) and L2=IC2(I)

    If
                    NC(L1) ∗ D(I, L1)2   NC(L2) ∗ D(I, L2)2
                                       <                            (4)
                      NC(L1) + 1           NC(L2) + 1

            Then, point I remains in cluster L1

    Otherwise,
            IC1 ↔ IC2
            Update the centres of cluster L1 and L2
            Its noted that transfer took place

    This step is repeated until transfer take place in the last M
    steps.
AGORITHM




T HE QUICK - TRANSFER STAGE (QTRAN)
    We consider each point (I=1,2,...,M) in turn
    Let L1=IC1(I) and L2=IC2(I)

    If
                    NC(L1) ∗ D(I, L1)2   NC(L2) ∗ D(I, L2)2
                                       <                            (4)
                      NC(L1) + 1           NC(L2) + 1

            Then, point I remains in cluster L1

    Otherwise,
            IC1 ↔ IC2
            Update the centres of cluster L1 and L2
            Its noted that transfer took place

    This step is repeated until transfer take place in the last M
    steps.
AGORITHM




T HE QUICK - TRANSFER STAGE (QTRAN)
    We consider each point (I=1,2,...,M) in turn
    Let L1=IC1(I) and L2=IC2(I)

    If
                    NC(L1) ∗ D(I, L1)2   NC(L2) ∗ D(I, L2)2
                                       <                            (4)
                      NC(L1) + 1           NC(L2) + 1

            Then, point I remains in cluster L1

    Otherwise,
            IC1 ↔ IC2
            Update the centres of cluster L1 and L2
            Its noted that transfer took place

    This step is repeated until transfer take place in the last M
    steps.
AGORITHM




           F IGURE : General Schema
D EMONSTRATION




    1   I NTRODUCTION

    2   AGORITHM

    3   D EMONSTRATION

    4   C ONVERGENCE AND T IME

    5   C ONSIDERABLE PROGRESS

    6   T HE LIMITS OF THE ALGORITHM

    7   C ONCLUSION
D EMONSTRATION
D EMONSTRATION
D EMONSTRATION
D EMONSTRATION
D EMONSTRATION
D EMONSTRATION
D EMONSTRATION
C ONVERGENCE AND T IME




    1   I NTRODUCTION

    2   AGORITHM

    3   D EMONSTRATION

    4   C ONVERGENCE AND T IME

    5   C ONSIDERABLE PROGRESS

    6   T HE LIMITS OF THE ALGORITHM

    7   C ONCLUSION
C ONVERGENCE AND T IME




C ONVERGENCE
    There are convergence, but the algorithm produces a clustering
    which is only locally optimal!




    F IGURE : A typical example of the k-means convergence to a local
    optima
C ONVERGENCE AND T IME




T IME


    The time i approximately equal to

                                   CMNKI

           C: depends on the speed of the computer (=2.1×10−5 sec
           for an IBM 370/158)
           M: the number of points
           N: the number of dimensions
           K: the number of clusters
           I: the number of iterations (usually less than 10)
C ONSIDERABLE PROGRESS




    1   I NTRODUCTION

    2   AGORITHM

    3   D EMONSTRATION

    4   C ONVERGENCE AND T IME

    5   C ONSIDERABLE PROGRESS

    6   T HE LIMITS OF THE ALGORITHM

    7   C ONCLUSION
C ONSIDERABLE PROGRESS




R ELATED ALGORITHM



    AS 113: A TRANSFER ALGORITHM FOR NON - HIERARCHIAL
    CLASSIFICATION
    (by Banfield and Bassil in 1977)
           It uses swops as well as transfer to try to overcome the
           problem of local optima.
           It is too expensive to use if the size of the data set is
           significant.
C ONSIDERABLE PROGRESS




R ELATED ALGORITHM

    AS 58: E UCLIDEAN CLUSTER ANALYSIS
    (by Sparks in 1973)
           It find a K-partition of the sample, with within-cluster sum of
           squares.
           Only the closest centre is used to check for possible
           reallocation of the given point, it does not provide a locally
           optimal solution.
           A saving of about 50 per cent in time occurs in the
           k-Means algorithm due to using ”live” sets and due to using
           a quick- transfer stage which reduces the number of
           optimal transfer iterations by a factor of 4.
T HE LIMITS OF THE ALGORITHM




    1   I NTRODUCTION

    2   AGORITHM

    3   D EMONSTRATION

    4   C ONVERGENCE AND T IME

    5   C ONSIDERABLE PROGRESS

    6   T HE LIMITS OF THE ALGORITHM

    7   C ONCLUSION
T HE LIMITS OF THE ALGORITHM




T HE SELECTION OF THE INITIAL CLUSTER CENTRES




       F IGURE : The choice of the initial cluster centres affect the final results
T HE LIMITS OF THE ALGORITHM




T HE SELECTION OF THE INITIAL CLUSTER CENTRES

    T HE SOLUTION TO THE PROBLEM
         Proposed in the article:K sample points are chosen as
         the initial cluster centres.
         The points are first ordered by their distances to the overall
         mean of the sample. Then, for cluster L (L = 1,2, ..., K), the
         1 + (L -1) * [M/K]th point is chosen to be its initial cluster
         centre (it is guaranteed that no cluster will be empty).

            Commonly used:The cluster centers are chosen
            randomly.
            The algorithm is run several time, then select the set of
            cluster with minimum sum of the Squared Error.
T HE LIMITS OF THE ALGORITHM




T HE SELECTION OF THE NUMBER OF CLUSTERS



    T HE SOLUTION TO THE PROBLEM : T HE MEAN SHIFT ALGORITHM
    The mean shift is similar to K-means in that it maintains a set of
    data points that are iteratively replaced by means.

    But there is no need to choose the number of clusters, because
    mean shift is likely to find only a few clusters if indeed only a
    small number exist.
T HE LIMITS OF THE ALGORITHM




S ENSITIVITY TO NOISE
    T HE SOLUTION TO THE PROBLEM :K- MEANS ALGORITHM
    CONSIDERING WEIGHTS

    This algorithm is the same as the k-mean, but each variable is
    weighted to provide a lower weight to the variables affected by high
    noise.

    The center of cluster L:
                                           WH(I)
                                C(L) =            XI                 (5)
                                           WHC(L)

    The quantity computed:

                                     WHC(L) ∗ D(I, L)2
                                R=                                   (6)
                                     WHC(L) − WH(I)

    WH(I): the weight of each point I
    WHC(L): the weight of each cluster L
C ONCLUSION




    1   I NTRODUCTION

    2   AGORITHM

    3   D EMONSTRATION

    4   C ONVERGENCE AND T IME

    5   C ONSIDERABLE PROGRESS

    6   T HE LIMITS OF THE ALGORITHM

    7   C ONCLUSION
C ONCLUSION




   C ONCLUSION
       This algorithm has been a real revolution for the clustering
       methods:It produces good results and a high speed of
       convergence.

          This algorithm possesses flaws, but many algorithms
          implemented today are arising from it.
C ONCLUSION




R EFERENCES



           J. A. Hartigan and M. A.Wong (1979) Algorithm AS 136: A
           K-Means Clustering Algorithm. Appl.Statist., 28, 100-108.

           http://home.dei.polimi.it/matteucc/
           Clustering/tutorial_html/AppletKM.html
           http:
           //en.wikipedia.org/wiki/K-means_clustering
Thank you for your attention !!!
Retour.

More Related Content

What's hot

Mining of time series data base using fuzzy neural information systems
Mining of time series data base using fuzzy neural information systemsMining of time series data base using fuzzy neural information systems
Mining of time series data base using fuzzy neural information systemsDr.MAYA NAYAK
 
What is Quantum Computing and Why it is Important
What is Quantum Computing and Why it is ImportantWhat is Quantum Computing and Why it is Important
What is Quantum Computing and Why it is ImportantSasha Lazarevic
 
Tele4653 l4
Tele4653 l4Tele4653 l4
Tele4653 l4Vin Voro
 
Design and Analysis of a Control System Using Root Locus and Frequency Respon...
Design and Analysis of a Control System Using Root Locus and Frequency Respon...Design and Analysis of a Control System Using Root Locus and Frequency Respon...
Design and Analysis of a Control System Using Root Locus and Frequency Respon...Umair Shahzad
 
Radio Signal Classification with Deep Neural Networks
Radio Signal Classification with Deep Neural NetworksRadio Signal Classification with Deep Neural Networks
Radio Signal Classification with Deep Neural NetworksKachi Odoemene
 
Quantization
QuantizationQuantization
Quantizationwtyru1989
 
Digital Signal Processing Tutorial:Chapt 1 signal and systems
Digital Signal Processing Tutorial:Chapt 1 signal and systemsDigital Signal Processing Tutorial:Chapt 1 signal and systems
Digital Signal Processing Tutorial:Chapt 1 signal and systemsChandrashekhar Padole
 
Cryptanalysis with a Quantum Computer - An Exposition on Shor's Factoring Alg...
Cryptanalysis with a Quantum Computer - An Exposition on Shor's Factoring Alg...Cryptanalysis with a Quantum Computer - An Exposition on Shor's Factoring Alg...
Cryptanalysis with a Quantum Computer - An Exposition on Shor's Factoring Alg...Daniel Hutama
 
Improving initial generations in pso algorithm for transportation network des...
Improving initial generations in pso algorithm for transportation network des...Improving initial generations in pso algorithm for transportation network des...
Improving initial generations in pso algorithm for transportation network des...ijcsit
 
Accelerating Machine Learning Algorithms by integrating GPUs into MapReduce C...
Accelerating Machine Learning Algorithms by integrating GPUs into MapReduce C...Accelerating Machine Learning Algorithms by integrating GPUs into MapReduce C...
Accelerating Machine Learning Algorithms by integrating GPUs into MapReduce C...sergherrero
 
Design of sampled data control systems part 2. 6th lecture
Design of sampled data control systems part 2.  6th lectureDesign of sampled data control systems part 2.  6th lecture
Design of sampled data control systems part 2. 6th lectureKhalaf Gaeid Alshammery
 
Signal Prosessing Lab Mannual
Signal Prosessing Lab Mannual Signal Prosessing Lab Mannual
Signal Prosessing Lab Mannual Jitendra Jangid
 
Design and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding AlgorithmDesign and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding Algorithmcsandit
 
Petri Nets: Properties, Analysis and Applications
Petri Nets: Properties, Analysis and ApplicationsPetri Nets: Properties, Analysis and Applications
Petri Nets: Properties, Analysis and ApplicationsDr. Mohamed Torky
 
New controllers efficient model based design method
New controllers efficient model based design methodNew controllers efficient model based design method
New controllers efficient model based design methodAlexander Decker
 

What's hot (19)

Mining of time series data base using fuzzy neural information systems
Mining of time series data base using fuzzy neural information systemsMining of time series data base using fuzzy neural information systems
Mining of time series data base using fuzzy neural information systems
 
What is Quantum Computing and Why it is Important
What is Quantum Computing and Why it is ImportantWhat is Quantum Computing and Why it is Important
What is Quantum Computing and Why it is Important
 
Tele4653 l4
Tele4653 l4Tele4653 l4
Tele4653 l4
 
Design and Analysis of a Control System Using Root Locus and Frequency Respon...
Design and Analysis of a Control System Using Root Locus and Frequency Respon...Design and Analysis of a Control System Using Root Locus and Frequency Respon...
Design and Analysis of a Control System Using Root Locus and Frequency Respon...
 
An32272275
An32272275An32272275
An32272275
 
Adaptive dynamic programming algorithm for uncertain nonlinear switched systems
Adaptive dynamic programming algorithm for uncertain nonlinear switched systemsAdaptive dynamic programming algorithm for uncertain nonlinear switched systems
Adaptive dynamic programming algorithm for uncertain nonlinear switched systems
 
Radio Signal Classification with Deep Neural Networks
Radio Signal Classification with Deep Neural NetworksRadio Signal Classification with Deep Neural Networks
Radio Signal Classification with Deep Neural Networks
 
Quantization
QuantizationQuantization
Quantization
 
Digital Signal Processing Tutorial:Chapt 1 signal and systems
Digital Signal Processing Tutorial:Chapt 1 signal and systemsDigital Signal Processing Tutorial:Chapt 1 signal and systems
Digital Signal Processing Tutorial:Chapt 1 signal and systems
 
Ma2520962099
Ma2520962099Ma2520962099
Ma2520962099
 
Dsp2
Dsp2Dsp2
Dsp2
 
Cryptanalysis with a Quantum Computer - An Exposition on Shor's Factoring Alg...
Cryptanalysis with a Quantum Computer - An Exposition on Shor's Factoring Alg...Cryptanalysis with a Quantum Computer - An Exposition on Shor's Factoring Alg...
Cryptanalysis with a Quantum Computer - An Exposition on Shor's Factoring Alg...
 
Improving initial generations in pso algorithm for transportation network des...
Improving initial generations in pso algorithm for transportation network des...Improving initial generations in pso algorithm for transportation network des...
Improving initial generations in pso algorithm for transportation network des...
 
Accelerating Machine Learning Algorithms by integrating GPUs into MapReduce C...
Accelerating Machine Learning Algorithms by integrating GPUs into MapReduce C...Accelerating Machine Learning Algorithms by integrating GPUs into MapReduce C...
Accelerating Machine Learning Algorithms by integrating GPUs into MapReduce C...
 
Design of sampled data control systems part 2. 6th lecture
Design of sampled data control systems part 2.  6th lectureDesign of sampled data control systems part 2.  6th lecture
Design of sampled data control systems part 2. 6th lecture
 
Signal Prosessing Lab Mannual
Signal Prosessing Lab Mannual Signal Prosessing Lab Mannual
Signal Prosessing Lab Mannual
 
Design and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding AlgorithmDesign and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding Algorithm
 
Petri Nets: Properties, Analysis and Applications
Petri Nets: Properties, Analysis and ApplicationsPetri Nets: Properties, Analysis and Applications
Petri Nets: Properties, Analysis and Applications
 
New controllers efficient model based design method
New controllers efficient model based design methodNew controllers efficient model based design method
New controllers efficient model based design method
 

Similar to slides Céline Beji

Note and assignment mis3 5.3
Note and assignment mis3 5.3Note and assignment mis3 5.3
Note and assignment mis3 5.3RoshanTushar1
 
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorImplementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorPTIHPA
 
CS330-Lectures Statistics And Probability
CS330-Lectures Statistics And ProbabilityCS330-Lectures Statistics And Probability
CS330-Lectures Statistics And Probabilitybryan111472
 
7076 chapter5 slides
7076 chapter5 slides7076 chapter5 slides
7076 chapter5 slidesNguyen Mina
 
Tele4653 l3
Tele4653 l3Tele4653 l3
Tele4653 l3Vin Voro
 
Lecture 3 (ADSP).pptx
Lecture 3 (ADSP).pptxLecture 3 (ADSP).pptx
Lecture 3 (ADSP).pptxHarisMasood20
 
Pid En Un Pic16 F684
Pid En Un Pic16 F684Pid En Un Pic16 F684
Pid En Un Pic16 F684quadrocopter
 
Pid En Un Pic16 F684
Pid En Un Pic16 F684Pid En Un Pic16 F684
Pid En Un Pic16 F684quadrocopter
 
extreme times in finance heston model.ppt
extreme times in finance heston model.pptextreme times in finance heston model.ppt
extreme times in finance heston model.pptArounaGanou2
 
Case Study (All)
Case Study (All)Case Study (All)
Case Study (All)gudeyi
 
Design and Analysis of Algorithms - Divide and Conquer
Design and Analysis of Algorithms - Divide and ConquerDesign and Analysis of Algorithms - Divide and Conquer
Design and Analysis of Algorithms - Divide and ConquerSeshu Chakravarthy
 
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...Austin Benson
 
Algorithm Design and Analysis
Algorithm Design and AnalysisAlgorithm Design and Analysis
Algorithm Design and AnalysisReetesh Gupta
 

Similar to slides Céline Beji (20)

Lti system
Lti systemLti system
Lti system
 
Note and assignment mis3 5.3
Note and assignment mis3 5.3Note and assignment mis3 5.3
Note and assignment mis3 5.3
 
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorImplementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
 
CS330-Lectures Statistics And Probability
CS330-Lectures Statistics And ProbabilityCS330-Lectures Statistics And Probability
CS330-Lectures Statistics And Probability
 
Slide11 icc2015
Slide11 icc2015Slide11 icc2015
Slide11 icc2015
 
13486500-FFT.ppt
13486500-FFT.ppt13486500-FFT.ppt
13486500-FFT.ppt
 
7076 chapter5 slides
7076 chapter5 slides7076 chapter5 slides
7076 chapter5 slides
 
Tele4653 l3
Tele4653 l3Tele4653 l3
Tele4653 l3
 
Lecture 3 (ADSP).pptx
Lecture 3 (ADSP).pptxLecture 3 (ADSP).pptx
Lecture 3 (ADSP).pptx
 
Basic concepts
Basic conceptsBasic concepts
Basic concepts
 
What is I/Q phase
What is I/Q phaseWhat is I/Q phase
What is I/Q phase
 
l1.ppt
l1.pptl1.ppt
l1.ppt
 
Pid En Un Pic16 F684
Pid En Un Pic16 F684Pid En Un Pic16 F684
Pid En Un Pic16 F684
 
Pid En Un Pic16 F684
Pid En Un Pic16 F684Pid En Un Pic16 F684
Pid En Un Pic16 F684
 
extreme times in finance heston model.ppt
extreme times in finance heston model.pptextreme times in finance heston model.ppt
extreme times in finance heston model.ppt
 
Case Study (All)
Case Study (All)Case Study (All)
Case Study (All)
 
Design and Analysis of Algorithms - Divide and Conquer
Design and Analysis of Algorithms - Divide and ConquerDesign and Analysis of Algorithms - Divide and Conquer
Design and Analysis of Algorithms - Divide and Conquer
 
Unit ii
Unit iiUnit ii
Unit ii
 
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
 
Algorithm Design and Analysis
Algorithm Design and AnalysisAlgorithm Design and Analysis
Algorithm Design and Analysis
 

More from Christian Robert

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceChristian Robert
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinChristian Robert
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?Christian Robert
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Christian Robert
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Christian Robert
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking componentsChristian Robert
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihoodChristian Robert
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)Christian Robert
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerChristian Robert
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussionChristian Robert
 

More from Christian Robert (20)

Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de France
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
 
discussion of ICML23.pdf
discussion of ICML23.pdfdiscussion of ICML23.pdf
discussion of ICML23.pdf
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?
 
restore.pdf
restore.pdfrestore.pdf
restore.pdf
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?
 
CDT 22 slides.pdf
CDT 22 slides.pdfCDT 22 slides.pdf
CDT 22 slides.pdf
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking components
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihood
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
 
eugenics and statistics
eugenics and statisticseugenics and statistics
eugenics and statistics
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussion
 
the ABC of ABC
the ABC of ABCthe ABC of ABC
the ABC of ABC
 

Recently uploaded

Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxsocialsciencegdgrohi
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
Science lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonScience lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonJericReyAuditor
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,Virag Sontakke
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 

Recently uploaded (20)

Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Science lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonScience lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lesson
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 

slides Céline Beji

  • 1. R EADING SEMINAR ON CLASSICS PRESENTED BY ´ B EJI C E LINE Article AS 136: A K-Means Clustering Algorithm Suggested by C. Robert
  • 2. Plan 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 3. Plan 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 4. Plan 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 5. Plan 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 6. Plan 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 7. Plan 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 8. Plan 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 9. I NTRODUCTION 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 10. I NTRODUCTION P RESENTATION OF THE ARTICLE AS 136: A K-Means Clustering Algorithm Authors: J. A Hartigan and M. A. Wong Source: Journal of the Roal Statistical Society. Series C (Applied Statistics), Vol 28, No.1 (1979), pp.100-108 Implemented in FORTRAN
  • 11. I NTRODUCTION C LUSTERING Clustering is the classical problem of dividing a data sample in some space into a collection of disjoint groups.
  • 12. I NTRODUCTION T HE AIM OF THE ALGORITHM Divide M points in N dimensions into K clusters so that the within-cluster sum of squares is minimized.
  • 13. I NTRODUCTION E XAMPLE OF APPLICATION The algorithm is apply even on large data sets. For example ranging from: Market segmentation Image processing Geostatistics ...
  • 14. I NTRODUCTION E XAMPLE OF APPLICATION M ARKET SEGMENTATION It is necessary to segment the Customer in order to know more precisely the needs and expectations of each group. M: number of people N: number of criterion (age, sex, social status, etc ...) k: number of cluster F IGURE : market segmentation
  • 15. AGORITHM 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 16. AGORITHM F IGURE : General Schema
  • 17. AGORITHM F IGURE : General Schema
  • 18. AGORITHM I NITIALIZATION 1 For each point point (I=1,2,...,M) find its clostest and second clostest cluster centers, keep them in IC1(I) and IC2(I), respectively. 2 Update the cluster centres to be the average of the points contained within them. 3 Put all cluster in the live set.
  • 19. AGORITHM I NITIALIZATION 1 For each point point (I=1,2,...,M) find its clostest and second clostest cluster centers, keep them in IC1(I) and IC2(I), respectively. 2 Update the cluster centres to be the average of the points contained within them. 3 Put all cluster in the live set.
  • 20. AGORITHM I NITIALIZATION 1 For each point point (I=1,2,...,M) find its clostest and second clostest cluster centers, keep them in IC1(I) and IC2(I), respectively. 2 Update the cluster centres to be the average of the points contained within them. 3 Put all cluster in the live set.
  • 21. AGORITHM F IGURE : General Schema
  • 22. AGORITHM T HE LIVE SET We consider each point (I=1,2,...,M), let point I be in cluster L1. L1 is in the live set: If the cluster is updated in the quick-transfer (QTRAN) stage. L1 is not in the live set : otherwise, and if it has not been updated in the last M optimal-transfer (OPTRA) steps.
  • 23. AGORITHM T HE LIVE SET We consider each point (I=1,2,...,M), let point I be in cluster L1. L1 is in the live set: If the cluster is updated in the quick-transfer (QTRAN) stage. L1 is not in the live set : otherwise, and if it has not been updated in the last M optimal-transfer (OPTRA) steps.
  • 24. AGORITHM T HE LIVE SET We consider each point (I=1,2,...,M), let point I be in cluster L1. L1 is in the live set: If the cluster is updated in the quick-transfer (QTRAN) stage. L1 is not in the live set : otherwise, and if it has not been updated in the last M optimal-transfer (OPTRA) steps.
  • 25. AGORITHM F IGURE : General Schema
  • 26. AGORITHM O PTIMAL - TRANSFER STAGE (OPTRA) Compute the minimum of the quantity for all cluster L (L=L1) , NC(L) ∗ D(I, L)2 R2 = (1) NC(L) + 1 NC(L): The number of points in cluster L D(I,L): The Euclidean distance between point I and cluster L Let L2 be the cluster with the smallest R2
  • 27. AGORITHM O PTIMAL - TRANSFER STAGE (OPTRA) If: NC(L1) ∗ D(I, L1)2 NC(L2) ∗ D(I, L2)2 ≥ (2) NC(L1) + 1 NC(L2) + 1 No reallocation L2 is the new IC2(I) If: NC(L1) ∗ D(I, L1)2 NC(L2) ∗ D(I, L2)2 < (3) NC(L1) + 1 NC(L2) + 1 I is allocated to cluster L2 L1 is the new IC2(I) Cluster centres are updated to be the means of points assigned to them The two clusters that are involved in the transfer of point I at this particular step are now in the live set
  • 28. AGORITHM O PTIMAL - TRANSFER STAGE (OPTRA) If: NC(L1) ∗ D(I, L1)2 NC(L2) ∗ D(I, L2)2 ≥ (2) NC(L1) + 1 NC(L2) + 1 No reallocation L2 is the new IC2(I) If: NC(L1) ∗ D(I, L1)2 NC(L2) ∗ D(I, L2)2 < (3) NC(L1) + 1 NC(L2) + 1 I is allocated to cluster L2 L1 is the new IC2(I) Cluster centres are updated to be the means of points assigned to them The two clusters that are involved in the transfer of point I at this particular step are now in the live set
  • 29. AGORITHM F IGURE : General Schema
  • 30. AGORITHM F IGURE : General Schema
  • 31. AGORITHM F IGURE : General Schema
  • 32. AGORITHM F IGURE : General Schema
  • 33. AGORITHM T HE QUICK - TRANSFER STAGE (QTRAN) We consider each point (I=1,2,...,M) in turn Let L1=IC1(I) and L2=IC2(I) If NC(L1) ∗ D(I, L1)2 NC(L2) ∗ D(I, L2)2 < (4) NC(L1) + 1 NC(L2) + 1 Then, point I remains in cluster L1 Otherwise, IC1 ↔ IC2 Update the centres of cluster L1 and L2 Its noted that transfer took place This step is repeated until transfer take place in the last M steps.
  • 34. AGORITHM T HE QUICK - TRANSFER STAGE (QTRAN) We consider each point (I=1,2,...,M) in turn Let L1=IC1(I) and L2=IC2(I) If NC(L1) ∗ D(I, L1)2 NC(L2) ∗ D(I, L2)2 < (4) NC(L1) + 1 NC(L2) + 1 Then, point I remains in cluster L1 Otherwise, IC1 ↔ IC2 Update the centres of cluster L1 and L2 Its noted that transfer took place This step is repeated until transfer take place in the last M steps.
  • 35. AGORITHM T HE QUICK - TRANSFER STAGE (QTRAN) We consider each point (I=1,2,...,M) in turn Let L1=IC1(I) and L2=IC2(I) If NC(L1) ∗ D(I, L1)2 NC(L2) ∗ D(I, L2)2 < (4) NC(L1) + 1 NC(L2) + 1 Then, point I remains in cluster L1 Otherwise, IC1 ↔ IC2 Update the centres of cluster L1 and L2 Its noted that transfer took place This step is repeated until transfer take place in the last M steps.
  • 36. AGORITHM T HE QUICK - TRANSFER STAGE (QTRAN) We consider each point (I=1,2,...,M) in turn Let L1=IC1(I) and L2=IC2(I) If NC(L1) ∗ D(I, L1)2 NC(L2) ∗ D(I, L2)2 < (4) NC(L1) + 1 NC(L2) + 1 Then, point I remains in cluster L1 Otherwise, IC1 ↔ IC2 Update the centres of cluster L1 and L2 Its noted that transfer took place This step is repeated until transfer take place in the last M steps.
  • 37. AGORITHM T HE QUICK - TRANSFER STAGE (QTRAN) We consider each point (I=1,2,...,M) in turn Let L1=IC1(I) and L2=IC2(I) If NC(L1) ∗ D(I, L1)2 NC(L2) ∗ D(I, L2)2 < (4) NC(L1) + 1 NC(L2) + 1 Then, point I remains in cluster L1 Otherwise, IC1 ↔ IC2 Update the centres of cluster L1 and L2 Its noted that transfer took place This step is repeated until transfer take place in the last M steps.
  • 38. AGORITHM F IGURE : General Schema
  • 39. D EMONSTRATION 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 47. C ONVERGENCE AND T IME 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 48. C ONVERGENCE AND T IME C ONVERGENCE There are convergence, but the algorithm produces a clustering which is only locally optimal! F IGURE : A typical example of the k-means convergence to a local optima
  • 49. C ONVERGENCE AND T IME T IME The time i approximately equal to CMNKI C: depends on the speed of the computer (=2.1×10−5 sec for an IBM 370/158) M: the number of points N: the number of dimensions K: the number of clusters I: the number of iterations (usually less than 10)
  • 50. C ONSIDERABLE PROGRESS 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 51. C ONSIDERABLE PROGRESS R ELATED ALGORITHM AS 113: A TRANSFER ALGORITHM FOR NON - HIERARCHIAL CLASSIFICATION (by Banfield and Bassil in 1977) It uses swops as well as transfer to try to overcome the problem of local optima. It is too expensive to use if the size of the data set is significant.
  • 52. C ONSIDERABLE PROGRESS R ELATED ALGORITHM AS 58: E UCLIDEAN CLUSTER ANALYSIS (by Sparks in 1973) It find a K-partition of the sample, with within-cluster sum of squares. Only the closest centre is used to check for possible reallocation of the given point, it does not provide a locally optimal solution. A saving of about 50 per cent in time occurs in the k-Means algorithm due to using ”live” sets and due to using a quick- transfer stage which reduces the number of optimal transfer iterations by a factor of 4.
  • 53. T HE LIMITS OF THE ALGORITHM 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 54. T HE LIMITS OF THE ALGORITHM T HE SELECTION OF THE INITIAL CLUSTER CENTRES F IGURE : The choice of the initial cluster centres affect the final results
  • 55. T HE LIMITS OF THE ALGORITHM T HE SELECTION OF THE INITIAL CLUSTER CENTRES T HE SOLUTION TO THE PROBLEM Proposed in the article:K sample points are chosen as the initial cluster centres. The points are first ordered by their distances to the overall mean of the sample. Then, for cluster L (L = 1,2, ..., K), the 1 + (L -1) * [M/K]th point is chosen to be its initial cluster centre (it is guaranteed that no cluster will be empty). Commonly used:The cluster centers are chosen randomly. The algorithm is run several time, then select the set of cluster with minimum sum of the Squared Error.
  • 56. T HE LIMITS OF THE ALGORITHM T HE SELECTION OF THE NUMBER OF CLUSTERS T HE SOLUTION TO THE PROBLEM : T HE MEAN SHIFT ALGORITHM The mean shift is similar to K-means in that it maintains a set of data points that are iteratively replaced by means. But there is no need to choose the number of clusters, because mean shift is likely to find only a few clusters if indeed only a small number exist.
  • 57. T HE LIMITS OF THE ALGORITHM S ENSITIVITY TO NOISE T HE SOLUTION TO THE PROBLEM :K- MEANS ALGORITHM CONSIDERING WEIGHTS This algorithm is the same as the k-mean, but each variable is weighted to provide a lower weight to the variables affected by high noise. The center of cluster L: WH(I) C(L) = XI (5) WHC(L) The quantity computed: WHC(L) ∗ D(I, L)2 R= (6) WHC(L) − WH(I) WH(I): the weight of each point I WHC(L): the weight of each cluster L
  • 58. C ONCLUSION 1 I NTRODUCTION 2 AGORITHM 3 D EMONSTRATION 4 C ONVERGENCE AND T IME 5 C ONSIDERABLE PROGRESS 6 T HE LIMITS OF THE ALGORITHM 7 C ONCLUSION
  • 59. C ONCLUSION C ONCLUSION This algorithm has been a real revolution for the clustering methods:It produces good results and a high speed of convergence. This algorithm possesses flaws, but many algorithms implemented today are arising from it.
  • 60. C ONCLUSION R EFERENCES J. A. Hartigan and M. A.Wong (1979) Algorithm AS 136: A K-Means Clustering Algorithm. Appl.Statist., 28, 100-108. http://home.dei.polimi.it/matteucc/ Clustering/tutorial_html/AppletKM.html http: //en.wikipedia.org/wiki/K-means_clustering
  • 61. Thank you for your attention !!! Retour.