Consensus Clustering
Name: Arghadip Chakraborty
College: Netaji Subhash Engineering College
Stream: Computer Science & Engineering
Section: B Year: 3rd year Class Roll: 91
Univ. Roll: 10900117106
What is Clustering?
➢ Grouping objects into clusters
according to their similarity.
➢ Unsupervised learning method.
➢ E.g. K-means, K-prototypes, C-means,
DBSCAN, etc.
So.. what is Consensus Clustering?
➢ Consensus means ‘General Agreement’.
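➢ Combining multiple clusterings of the same data into a single, more stable
clustering that is usually better than the individual input clusterings.
➢ The process is done by generating a consensus matrix at each level; the
matrix records how often each pair of objects is clustered together.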
Workflow of Consensus Clustering
[Figure: Cluster 1, Cluster 2, …, Cluster N → Consensus Building] [2]
But...why Consensus Clustering?
➢ Better quality and robustness of the clusters.
➢ Helps in finding the correct number of clusters.
➢ Better handling of missing data.
➢ Individual partitions can be obtained independently.
Process of Consensus Clustering
Consensus Clustering is based on two steps:
➢ Partition Generation.
➢ Consensus Generation.
[1]
Partition Generation Process
Partitions can be generated by:
➢ Using different subsets of attributes.
➢ Applying different clustering algorithms with different biases.
➢ Using different parameters for clustering.
➢ Using random sub-samples of the dataset.
[2]
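The last two strategies can be sketched as follows. This is a minimal illustration, not the method from [2]: the helper names (`kmeans_1d`, `generate_partitions`), the toy 1-D data, and the choice of a tiny hand-written K-means as the base clusterer are all assumptions made for the example.

```python
import random

def kmeans_1d(values, k, rng, iters=20):
    """Tiny 1-D k-means (illustrative only); returns one label per value."""
    centers = rng.sample(values, k)
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of its members (keep it if empty).
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels

def generate_partitions(values, ks, n_runs, sample_frac, seed=0):
    """Build an ensemble by varying k (parameter) and the random sub-sample.

    Each partition maps point index -> label; points left out of a
    sub-sample are simply absent from that partition.
    """
    rng = random.Random(seed)
    n = len(values)
    ensemble = []
    for k in ks:
        for _ in range(n_runs):
            size = max(k, int(sample_frac * n))
            idx = sorted(rng.sample(range(n), size))
            labels = kmeans_1d([values[i] for i in idx], k, rng)
            ensemble.append(dict(zip(idx, labels)))
    return ensemble

data = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8, 9.0, 9.2, 8.9]
ensemble = generate_partitions(data, ks=[2, 3], n_runs=3, sample_frac=0.8)
print(len(ensemble))  # 2 values of k x 3 runs = 6 partitions
```

Because each run is independent of the others, the partitions could also be produced in parallel or by entirely different algorithms.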
Consensus Generation Process
Consensus is generally generated using one of two approaches:
➢ Median partitioning based approach.
➢ Co-occurrence based approach.
○ Relabeling/Voting based method.
○ Co-association matrix based method.
○ Graph based method.
Median Partitioning approach
Given a set of partitions P = {P1, P2, …, Pn} of all the data points and
a similarity function f(Pi, Pj), the median partition Pc is the
partition that maximizes the total similarity to the partitions in the set.
The similarity function depends on the agreement and disagreement between
partitions over pairs of data points, and can be measured by the F-measure, the Rand index, etc.
[2]
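A small sketch of this idea, using the Rand index as f. One simplifying assumption is labeled in the code: the search for Pc is restricted to the input partitions themselves, since searching over all possible partitions is far more expensive. The function names and the toy partitions are illustrative.

```python
from itertools import combinations

def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree
    (both put the pair together, or both put it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def median_partition(partitions, f=rand_index):
    """Return the candidate with the highest total similarity to the set.

    Assumption: candidates are limited to the input partitions.
    """
    return max(partitions, key=lambda p: sum(f(p, q) for q in partitions))

P = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],   # same grouping as the first, labels swapped
    [0, 0, 1, 1, 1, 1],   # moves the third point to the other cluster
]
Pc = median_partition(P)
print(Pc)                      # [0, 0, 0, 1, 1, 1]
print(rand_index(P[0], P[1]))  # 1.0
```

The Rand index compares pairs of points rather than label values, which is why the first two partitions count as identical even though their labels are swapped.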
Co-occurrence based approach
1. Relabeling/Voting method (Algorithm):
STEP 1: Generate the clusterings (partitions).
STEP 2: Determine the correspondence between each partition's cluster labels and the current consensus.
STEP 3: Each instance receives a vote from its cluster assignment in each partition.
STEP 4: Update the consensus and the cluster assignments accordingly.
[2]
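The four steps can be sketched roughly as below. This is an illustration under stated assumptions rather than the exact procedure from [2]: every partition is assumed to use the same number of labels k, the first partition seeds the consensus, and label correspondence is found by brute force over all k! permutations (the Hungarian algorithm is the usual choice for larger k). The function names are hypothetical.

```python
from itertools import permutations

def relabel(partition, reference, k):
    """STEP 2: rename the labels of `partition` to best match `reference`.

    Brute force over label permutations; only sensible for small k.
    """
    best = max(
        permutations(range(k)),
        key=lambda perm: sum(perm[l] == r for l, r in zip(partition, reference)),
    )
    return [best[l] for l in partition]

def voting_consensus(partitions, k):
    n = len(partitions[0])
    votes = [[0] * k for _ in range(n)]   # votes[i][c]: votes for label c on instance i
    consensus = list(partitions[0])       # assumption: first partition seeds the consensus
    for p in partitions:
        aligned = relabel(p, consensus, k)
        for i, label in enumerate(aligned):   # STEP 3: each instance gains a vote
            votes[i][label] += 1
        # STEP 4: the consensus label is the one with the most votes so far.
        consensus = [max(range(k), key=lambda c: row[c]) for row in votes]
    return consensus

partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],   # same grouping, labels swapped
    [0, 0, 1, 1, 1, 1],   # disagrees on the third instance
]
consensus = voting_consensus(partitions, k=2)
print(consensus)  # [0, 0, 0, 1, 1, 1]
```

The relabeling step matters because cluster labels are arbitrary: without it, the second partition would vote against the first even though both describe the same grouping.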
Co-occurrence based approach
2. Co-association matrix method (Algorithm):
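STEP 1: Generate the clusterings (partitions).
STEP 2: Generate the co-association matrix, where the similarity of two data points is the fraction of partitions that place them in the same cluster.
STEP 3: Apply hierarchical clustering to the co-association matrix.
STEP 4: Update the clusters.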
[2]
[3]
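A minimal sketch of the co-association method on a toy ensemble. The choice of average linkage, the target number of clusters k, and the function names are assumptions made for illustration; other linkage criteria could be used in STEP 3.

```python
def co_association(partitions):
    """C[i][j] = fraction of partitions that put points i and j in the same cluster."""
    n = len(partitions[0])
    m = len(partitions)
    return [[sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
            for i in range(n)]

def average_linkage(sim, k):
    """Agglomerative clustering on a similarity matrix: repeatedly merge the
    two clusters with the highest average pairwise similarity until k remain."""
    clusters = [[i] for i in range(len(sim))]

    def avg(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = max(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda xy: avg(clusters[xy[0]], clusters[xy[1]]),
        )
        merged = clusters.pop(b)      # b > a, so index a is unaffected
        clusters[a].extend(merged)
    labels = [0] * len(sim)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
C = co_association(partitions)
labels = average_linkage(C, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Unlike the voting method, this approach needs no label correspondence, since the matrix only records whether two points share a cluster.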
Co-occurrence based approach
3. Graph based method (Algorithm):
STEP 1: Generate a weighted graph that represents the multiple clusterings.
STEP 2: Find the optimal partition by minimizing the graph cut.
[4]
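A toy sketch of the two steps. It is not the formulation from [4]: here the graph has one vertex per data point with edge weights equal to co-clustering counts, and the minimum cut is found by exhaustive search over balanced two-way splits. The balance constraint is an assumption (an unconstrained minimum cut tends to isolate single vertices), and practical systems use graph partitioners instead of brute force.

```python
from itertools import combinations

def similarity_graph(partitions):
    """STEP 1: w[i][j] = number of partitions that co-cluster points i and j."""
    n = len(partitions[0])
    return [[sum(p[i] == p[j] for p in partitions) if i != j else 0
             for j in range(n)] for i in range(n)]

def cut_weight(w, side):
    """Total weight of edges crossing from `side` to the remaining vertices."""
    rest = [v for v in range(len(w)) if v not in side]
    return sum(w[i][j] for i in side for j in rest)

def min_cut_bipartition(w):
    """STEP 2: search balanced two-way splits for the smallest cut.

    Exponential in the number of points, so only for toy inputs.
    """
    n = len(w)
    best = min(combinations(range(n), n // 2), key=lambda s: cut_weight(w, s))
    return [0 if v in best else 1 for v in range(n)]

partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
w = similarity_graph(partitions)
labels = min_cut_bipartition(w)
print(labels)                     # [0, 0, 0, 1, 1, 1]
print(cut_weight(w, (0, 1, 2)))   # 3
```

Cutting few, light edges means separating points that the input clusterings rarely placed together, which is why the minimum cut acts as a consensus.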
Real World Example
[5]
References
[1] Ensemble Clustering, Bernice Lucas, 2016.
[2] Consensus Clustering, Javier Béjar, URL - Spring 2020, CS - MAI.
[3] Consensus Clustering, Vega-Pons and Ruiz-Shulcloper, 2011.
[4] Consensus Clustering, Fern and Brodley, 2004.
[5] Ensemble Clustering using Semidefinite Programming, Singh et al., NIPS 2007.
THANK YOU!
