SlideShare a Scribd company logo
1 of 12
Cluster	
  Forests	
  
Presented	
  by:	
  Romit	
  Singhai	
  
	
  
References	
  
Cluster	
  Forests	
  
Donghui	
  Yan	
  
	
  Department	
  of	
  Sta=s=cs	
  
University	
  of	
  California,	
  Berkeley	
  
Aiyou	
  Chen	
  (Google)	
  
Michael	
  I.	
  Jordan	
  (U.C.	
  Berkeley)	
  
Overview	
  
•  Clustering	
  aims	
  to	
  par==on	
  a	
  set	
  of	
  data	
  such	
  
that	
  points	
  are	
  “similar”	
  within	
  the	
  same	
  
cluster	
  while	
  “dissimilar”	
  across	
  clusters.	
  
v One	
  of	
  the	
  fundamental	
  task	
  in	
  machine	
  learning	
  
and	
  paOern	
  classifica=on	
  
v Applicable	
  in	
  wide	
  scien=fic	
  and	
  business	
  domains	
  
Challenges	
  
•  Modern	
  data	
  	
  has	
  addi=onal	
  challenges	
  
v High	
  dimensionality	
  
v Huge	
  number	
  of	
  observa=ons	
  
v Increasingly	
  complex	
  
Mo=va=on	
  
•  Ensemble	
  to	
  achieve	
  best	
  performance	
  
•  Can	
  we	
  develop	
  a	
  clustering	
  analogy	
  to	
  RF?	
  
•  Unifying	
  view	
  of	
  clustering	
  and	
  classifica=on	
  
General	
  Approach	
  
Cluster	
  ensemble	
  methods	
  generally	
  consist	
  of	
  
two	
  stages	
  
v Genera=on	
  of	
  clustering	
  instances	
  
v Aggrega=on	
  of	
  mul=ple	
  clustering	
  instances	
  
	
  
Algorithmic	
  descrip=on	
  of	
  CF	
  
Experiments	
  
Dataset	
   #	
  Features	
   #	
  Classes	
  
Soybean	
   35	
   4	
  
ImageSeg	
   19	
   7	
  
SPECT	
   22	
   2	
  
Heart	
   13	
   2	
  
Wine	
   13	
   3	
  
WDBC	
   30	
   2	
  
Robot	
   164	
   5	
  
Madelon	
   500	
   2	
  
Performance	
  Metrics	
  
•  Propor=on	
  of	
  pairs	
  of	
  points	
  with	
  “correct”	
  co-­‐
cluster	
  membership	
  
Pr	
  =	
  (#	
  correctly	
  clustered	
  pairs/Total	
  #	
  pairs)	
  X	
  100	
  
%	
  
•  Clustering	
  accuracy	
  	
  
Pc	
  =	
  (#	
  points	
  with	
  “correct”	
  cluster	
  membership/
Total	
  #	
  points)	
  X	
  100	
  %	
  
	
  
•  Assume	
  availability	
  of	
  “true”	
  labels	
  for	
  the	
  
datasets	
  
Results	
  under	
  Pr	
  
Experiments	
  on	
  eight	
  UC	
  Irvine	
  datasets	
  
Results	
  under	
  Pc	
  
Conclusions	
  
•  CF	
  is	
  cluster	
  ensemble	
  method	
  that	
  
incorporates	
  model	
  selec=on	
  
•  Good	
  empirical	
  performance	
  

More Related Content

Similar to Cluster Forest

Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data MiningKai Koenig
 
A new development in the hierarchical clustering of repertory grid data
A new development in the hierarchical clustering of repertory grid dataA new development in the hierarchical clustering of repertory grid data
A new development in the hierarchical clustering of repertory grid dataMark Heckmann
 
Data mining with Weka
Data mining with WekaData mining with Weka
Data mining with WekaAlbanLevy
 
Presentation
PresentationPresentation
Presentationbutest
 
To bag, or to boost? A question of balance
To bag, or to boost? A question of balanceTo bag, or to boost? A question of balance
To bag, or to boost? A question of balanceAlex Henderson
 
Barga Data Science lecture 5
Barga Data Science lecture 5Barga Data Science lecture 5
Barga Data Science lecture 5Roger Barga
 
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...cvpaper. challenge
 
Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...Dalei Li
 
Winning Kaggle 101: Introduction to Stacking
Winning Kaggle 101: Introduction to StackingWinning Kaggle 101: Introduction to Stacking
Winning Kaggle 101: Introduction to StackingTed Xiao
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clusteringArshad Farhad
 
Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...butest
 
powerpoint feb
powerpoint febpowerpoint feb
powerpoint febimu409
 
A scalable collaborative filtering framework based on co clustering
A scalable collaborative filtering framework based on co clusteringA scalable collaborative filtering framework based on co clustering
A scalable collaborative filtering framework based on co clusteringAllenWu
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningAkshay Kanchan
 
Seminar Slides
Seminar SlidesSeminar Slides
Seminar Slidespannicle
 
32_Nov07_MachineLear..
32_Nov07_MachineLear..32_Nov07_MachineLear..
32_Nov07_MachineLear..butest
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learningAmAn Singh
 

Similar to Cluster Forest (20)

Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
Exposé Ontology
Exposé OntologyExposé Ontology
Exposé Ontology
 
A new development in the hierarchical clustering of repertory grid data
A new development in the hierarchical clustering of repertory grid dataA new development in the hierarchical clustering of repertory grid data
A new development in the hierarchical clustering of repertory grid data
 
Mattar_PhD_Thesis
Mattar_PhD_ThesisMattar_PhD_Thesis
Mattar_PhD_Thesis
 
Data mining with Weka
Data mining with WekaData mining with Weka
Data mining with Weka
 
Presentation
PresentationPresentation
Presentation
 
To bag, or to boost? A question of balance
To bag, or to boost? A question of balanceTo bag, or to boost? A question of balance
To bag, or to boost? A question of balance
 
Barga Data Science lecture 5
Barga Data Science lecture 5Barga Data Science lecture 5
Barga Data Science lecture 5
 
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
 
Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...
 
Winning Kaggle 101: Introduction to Stacking
Winning Kaggle 101: Introduction to StackingWinning Kaggle 101: Introduction to Stacking
Winning Kaggle 101: Introduction to Stacking
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clustering
 
Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...
 
powerpoint feb
powerpoint febpowerpoint feb
powerpoint feb
 
A scalable collaborative filtering framework based on co clustering
A scalable collaborative filtering framework based on co clusteringA scalable collaborative filtering framework based on co clustering
A scalable collaborative filtering framework based on co clustering
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
Seminar Slides
Seminar SlidesSeminar Slides
Seminar Slides
 
32_Nov07_MachineLear..
32_Nov07_MachineLear..32_Nov07_MachineLear..
32_Nov07_MachineLear..
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 
Big Data Challenges and Solutions
Big Data Challenges and SolutionsBig Data Challenges and Solutions
Big Data Challenges and Solutions
 

Recently uploaded

100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 

Recently uploaded (20)

100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 

Cluster Forest

  • 1. Cluster  Forests   Presented  by:  Romit  Singhai    
  • 2. References   Cluster  Forests   Donghui  Yan    Department  of  Sta=s=cs   University  of  California,  Berkeley   Aiyou  Chen  (Google)   Michael  I.  Jordan  (U.C.  Berkeley)  
  • 3. Overview   •  Clustering  aims  to  par==on  a  set  of  data  such   that  points  are  “similar”  within  the  same   cluster  while  “dissimilar”  across  clusters.   v One  of  the  fundamental  task  in  machine  learning   and  paOern  classifica=on   v Applicable  in  wide  scien=fic  and  business  domains  
  • 4. Challenges   •  Modern  data    has  addi=onal  challenges   v High  dimensionality   v Huge  number  of  observa=ons   v Increasingly  complex  
  • 5. Mo=va=on   •  Ensemble  to  achieve  best  performance   •  Can  we  develop  a  clustering  analogy  to  RF?   •  Unifying  view  of  clustering  and  classifica=on  
  • 6. General  Approach   Cluster  ensemble  methods  generally  consist  of   two  stages   v Genera=on  of  clustering  instances   v Aggrega=on  of  mul=ple  clustering  instances    
  • 8. Experiments   Dataset   #  Features   #  Classes   Soybean   35   4   ImageSeg   19   7   SPECT   22   2   Heart   13   2   Wine   13   3   WDBC   30   2   Robot   164   5   Madelon   500   2  
  • 9. Performance  Metrics   •  Propor=on  of  pairs  of  points  with  “correct”  co-­‐ cluster  membership   Pr  =  (#  correctly  clustered  pairs/Total  #  pairs)  X  100   %   •  Clustering  accuracy     Pc  =  (#  points  with  “correct”  cluster  membership/ Total  #  points)  X  100  %     •  Assume  availability  of  “true”  labels  for  the   datasets  
  • 10. Results  under  Pr   Experiments  on  eight  UC  Irvine  datasets  
  • 12. Conclusions   •  CF  is  cluster  ensemble  method  that   incorporates  model  selec=on   •  Good  empirical  performance