Incremental Conceptual Clustering
Kalpa Gunaratna
Reading group discussions @Kno.e.sis
Based on Fisher’s Cobweb algorithm
Clustering *
• Clustering is the unsupervised classification of patterns into groups.
* Jain, Anil K., M. Narasimha Murty, and Patrick J. Flynn. "Data clustering: a review." ACM Computing Surveys (CSUR) 31, no. 3 (1999): 264-323.
Focus on hierarchical clustering
• Single link clustering
The distance between two clusters is the minimum of the distances between all
pairs of patterns drawn from the two clusters.
In other words, it evaluates the dissimilarity between two clusters as the dissimilarity
of the nearest patterns, one from each cluster.
• Complete link clustering
The distance between two clusters is the maximum of the distances between all
pairs of patterns drawn from the two clusters.
In other words, it evaluates the dissimilarity between two clusters as the greatest
distance between any two patterns, one from each cluster.
• Produces compact clusters.
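A minimal sketch of the two inter-cluster distances defined above, assuming patterns are points in the plane and Euclidean distance is the dissimilarity (the function names and toy clusters are illustrative, not from the slides):

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def single_link(cluster_a, cluster_b):
    """Minimum distance over all pairs, one pattern from each cluster."""
    return min(dist(p, q) for p, q in product(cluster_a, cluster_b))


def complete_link(cluster_a, cluster_b):
    """Maximum distance over all pairs, one pattern from each cluster."""
    return max(dist(p, q) for p, q in product(cluster_a, cluster_b))


# Hypothetical toy clusters on a line.
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(3.0, 0.0), (6.0, 0.0)]

print(single_link(a, b))    # 2.0 (nearest pair: (1, 0) and (3, 0))
print(complete_link(a, b))  # 6.0 (farthest pair: (0, 0) and (6, 0))
```

An agglomerative algorithm would repeatedly merge the pair of clusters with the smallest such distance; only the choice of min or max changes between the two variants.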
• The single link algorithm can extract concentric clusters,
whereas complete link cannot.
• But the single link algorithm suffers from a chaining effect, whereas
complete link does not. Complete link is therefore generally considered to give
more useful clusters in many real problems.
• Dendrogram
Our focus – Incremental Conceptual Clustering
(Cobweb) 1, 2
Given a set of observations, humans acquire concepts that organize
those observations and use them in classifying future experiences. This
type of concept formation can occur in the absence of a tutor and it can
take place despite irrelevant and incomplete information.
1. Fisher, Douglas H. "Knowledge acquisition via incremental conceptual clustering." Machine Learning 2, no. 2 (1987): 139-172.
2. Gennari, John H., Pat Langley, and Doug Fisher. "Models of incremental concept formation." Artificial Intelligence 40, no. 1 (1989): 11-61.
• Cobweb
• Uses a hill climbing search strategy whose operators allow bi-directional
travel in the search space, so an earlier step can later be undone.
• Hill climbing is a classic AI search method in which one applies all operator instantiations,
compares the resulting states using an evaluation function, selects the best state, and
iterates until no more progress can be made.
• Uses an evaluation function called Category Utility (CU) to decide which
operator to apply at each step of the hill climbing search.
• CU rewards similarity within clusters and dissimilarity between clusters.
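The hill climbing loop described above can be sketched generically as follows. This is a toy illustration under stated assumptions, not Cobweb itself: the state is an integer, the two hypothetical operators move in opposite directions (standing in for bi-directional operators), and the evaluation function is an arbitrary stand-in for Category Utility.

```python
def hill_climb(state, operators, evaluate):
    """Apply every operator, keep the best successor, stop when nothing improves."""
    best_score = evaluate(state)
    while True:
        successors = [op(state) for op in operators]
        candidate = max(successors, key=evaluate)
        candidate_score = evaluate(candidate)
        if candidate_score <= best_score:
            return state  # no operator improves on the current state
        state, best_score = candidate, candidate_score


# Hypothetical toy problem: maximise -(x - 3)^2 over the integers.
operators = [lambda x: x + 1, lambda x: x - 1]
result = hill_climb(0, operators, lambda x: -(x - 3) ** 2)
print(result)  # 3
```

In Cobweb the state would be the concept hierarchy, the operators the tree-editing actions, and the search would run once per incoming observation rather than to convergence over a fixed data set.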
• Category utility function
• Intra-class similarity is measured by P(Ai = Vij | Ck) – predictability
• The larger this probability, the greater the proportion of class members sharing the value
and the more predictable the value is of class members.
• Inter-class similarity is measured by P(Ck | Ai = Vij) – predictiveness
• The larger this probability, the fewer the objects in contrasting classes that share this
value and the more predictive the value is of the class.
$$\sum_{k}\sum_{i}\sum_{j} P(A_i = V_{ij})\, P(C_k \mid A_i = V_{ij})\, P(A_i = V_{ij} \mid C_k)$$
Using Bayes’ theorem
$$\sum_{k} P(C_k) \sum_{i}\sum_{j} P(A_i = V_{ij} \mid C_k)^2$$
The inner double sum over i and j of P(Ai = Vij | Ck)^2 is the expected number of
attribute values that one can correctly guess for an arbitrary member of class Ck.
• CU is then defined as the increase in the expected number of attribute
values that can be correctly guessed, given a set of K categories, over
the expected number of correct guesses without such knowledge.
• The result is divided by K so that partitions of different sizes, as
produced by merging, splitting, or adding nodes, can be compared
(discussed next).
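A minimal sketch of the definition above, assuming nominal attributes and following Fisher (1987): CU = (1/K) Σk P(Ck) [Σi Σj P(Ai = Vij | Ck)^2 − Σi Σj P(Ai = Vij)^2]. The function names and the toy data are illustrative assumptions; probabilities are estimated as simple relative frequencies.

```python
from collections import Counter


def sum_squared_probs(instances):
    """Sum over attributes i and values j of P(A_i = V_ij)^2 within `instances`."""
    n = len(instances)
    total = 0.0
    for attribute in instances[0]:
        counts = Counter(inst[attribute] for inst in instances)
        total += sum((c / n) ** 2 for c in counts.values())
    return total


def category_utility(partition):
    """CU of a partition given as a list of clusters (each a list of dicts)."""
    everything = [inst for cluster in partition for inst in cluster]
    baseline = sum_squared_probs(everything)  # guesses without class knowledge
    gain = sum(
        len(cluster) / len(everything) * (sum_squared_probs(cluster) - baseline)
        for cluster in partition
    )
    return gain / len(partition)  # divide by K, the number of categories


# Hypothetical toy observations.
animals = [
    {"cover": "fur", "legs": "four"},
    {"cover": "fur", "legs": "four"},
    {"cover": "feathers", "legs": "two"},
    {"cover": "feathers", "legs": "two"},
]
good = [animals[:2], animals[2:]]                            # classes match the attributes
bad = [[animals[0], animals[2]], [animals[1], animals[3]]]  # classes carry no information

print(category_utility(good))  # 0.5
print(category_utility(bad))   # 0.0
```

In the good partition, knowing the class lets one guess both attribute values correctly (2 per instance) against a baseline of 1, so the gain is 1 and CU is 1/2; in the bad partition the classes predict nothing beyond the baseline.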
• There are four main operators in creating the hierarchy.
• Classify into an existing class.
• Create a new class.
• Combine two classes into one (merging).
• Divide a class into several classes (splitting).
• Because of the last two operations, Cobweb is normally less sensitive to
the order in which items are presented, although ordering effects can remain.
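The operator choice at one level of the hierarchy can be sketched as below. This is a simplified illustration under stated assumptions, not Fisher's full procedure: it scores "classify into an existing class", "create a new class", and "merge the two best hosts" with Category Utility and returns the best; splitting is omitted because it needs a class's children, which this flat sketch does not model. All names and data are hypothetical.

```python
from collections import Counter


def sum_squared_probs(instances):
    """Sum over attributes and values of the squared relative frequency."""
    n = len(instances)
    total = 0.0
    for attribute in instances[0]:
        counts = Counter(inst[attribute] for inst in instances)
        total += sum((c / n) ** 2 for c in counts.values())
    return total


def category_utility(partition):
    """CU of a flat partition (list of clusters, each a list of dicts)."""
    everything = [inst for cluster in partition for inst in cluster]
    baseline = sum_squared_probs(everything)
    gain = sum(
        len(c) / len(everything) * (sum_squared_probs(c) - baseline)
        for c in partition
    )
    return gain / len(partition)


def best_operator(partition, instance):
    """Return (CU, operator name, resulting partition) for the best-scoring operator."""
    candidates = []
    # Operator 1: classify the instance into each existing class in turn.
    for k in range(len(partition)):
        trial = [c + [instance] if i == k else c for i, c in enumerate(partition)]
        candidates.append((category_utility(trial), f"add to class {k}", trial))
    # Operator 2: create a new singleton class.
    trial = partition + [[instance]]
    candidates.append((category_utility(trial), "new class", trial))
    # Operator 3: merge the two best hosts and place the instance in the result.
    if len(partition) >= 2:
        hosts = sorted(range(len(partition)),
                       key=lambda k: candidates[k][0], reverse=True)[:2]
        merged = partition[hosts[0]] + partition[hosts[1]] + [instance]
        rest = [c for i, c in enumerate(partition) if i not in hosts]
        trial = rest + [merged]
        candidates.append((category_utility(trial), "merge two best hosts", trial))
    return max(candidates, key=lambda cand: cand[0])


partition = [
    [{"cover": "fur", "legs": "four"}, {"cover": "fur", "legs": "four"}],
    [{"cover": "feathers", "legs": "two"}],
]
new = {"cover": "feathers", "legs": "two"}
score, name, updated = best_operator(partition, new)
print(f"{name}: CU = {score:.3f}")  # add to class 1: CU = 0.500
```

Here adding the new observation to class 1 scores 0.5, against roughly 0.333 for a new class, 0.167 for class 0, and 0 for the merge, so classification into the existing class wins.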
• Merging
• Splitting
• Positive points about Incremental Conceptual Clustering (as I see it)
• Unsupervised
• Input order matters less, since merging and splitting can correct early choices
• Efficient – does not compute similarity/dissimilarity between all
pairs/combinations
• Good for dynamic environments
• Bi-directional search space walk in the hierarchy construction
• Tries to mimic human categorization behavior
• Clustering is based on probability – not just a similarity score
