Unsupervised Learning

Chap 10.6
Artificial Intelligence: Structures and Strategies for Complex Problem Solving,
Fifth Edition, George F. Luger

What we will be studying.
Automated Mathematician (AM)
Conceptual Clustering
COBWEB & Structure of Taxonomic Knowledge

So what is unsupervised learning, and how is it
different from supervised learning?

Automated Mathematician (AM)
● One of the earliest successful discovery systems.
● Created by Douglas Lenat in Lisp.
● Began with the concepts of set theory, operations for creating new knowledge by
modifying and combining existing concepts, and a set of heuristics.
● Limitations
○ Although AM discovered prime numbers and several other interesting concepts, it
failed to progress much beyond elementary number theory.
○ Inability to “learn to learn”: it did not acquire new heuristics from its new
discoveries in mathematics.

Clustering
● The task of grouping a set of objects so that objects in the same
group (called a cluster) are more similar to each other than to those in other groups
(clusters).
● It is a main task of exploratory data mining and a common technique for statistical
data analysis.
● Used in many fields, including machine learning, pattern recognition, and image
analysis.

The clustering problem
● Begins with a collection of unclassified objects and a means of measuring the
similarity of objects.
● The goal is to organize the objects into classes that meet some standard of quality (such as
maximizing the similarity of objects in the same class).
● Two strategies: numeric taxonomy and agglomerative clustering.

cont.
An agglomerative clustering algorithm builds clusters bottom-up:
● Examine all pairs of objects, select the pair with the highest degree of
similarity, and make that pair a cluster.
● Define the features of the cluster as some function (such as the average) of the features
of its component members, then replace the component objects with
this cluster definition.
● Repeat the process on the collection of objects until all objects have been
reduced to a single cluster.

The result is a binary tree whose leaf nodes are instances and whose internal
nodes are clusters of increasing size.
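The bottom-up procedure can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: objects are numeric feature tuples, "highest similarity" is taken to mean smallest Euclidean distance between cluster averages, and the names `centroid` and `agglomerate` are hypothetical, not taken from the text.

```python
from itertools import combinations
import math

def centroid(points):
    """Average the feature vectors of a cluster's members (the 'avg.' function)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def agglomerate(points):
    """Merge the two most similar clusters until one remains.

    Returns the binary tree as nested pairs; leaves are the original points.
    """
    # Each cluster is (tree, members); a single point starts as its own cluster.
    clusters = [(p, [p]) for p in points]
    while len(clusters) > 1:
        # Assumption: highest similarity = smallest distance between centroids.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: math.dist(centroid(clusters[ij[0]][1]),
                                     centroid(clusters[ij[1]][1])),
        )
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        # Replace the two component clusters with the merged cluster.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
tree = agglomerate(points)
print(tree)
```

On these four made-up points, the two nearby pairs are merged first and then joined at the root. Comparing every pair on every pass is quadratic per merge, which is acceptable for a small illustration but not for large data sets.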
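The algorithm can be extended to objects represented as sets of symbolic features.
Here similarity is the number of features two objects share divided by the number
of distinct features across both objects.
obj1 = {small, red, rubber, ball}
obj2 = {small, blue, rubber, ball}
obj3 = {large, black, wooden, ball}
sim(obj1, obj2) = 3/5
sim(obj1, obj3) = sim(obj2, obj3) = 1/7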
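The similarity values above are consistent with dividing the number of shared features by the number of distinct features across both objects (the Jaccard index). A minimal check in Python, assuming that reading of the metric and non-empty feature sets:

```python
from fractions import Fraction

def sim(a, b):
    """Shared features divided by distinct features across both objects."""
    return Fraction(len(a & b), len(a | b))

obj1 = {"small", "red", "rubber", "ball"}
obj2 = {"small", "blue", "rubber", "ball"}
obj3 = {"large", "black", "wooden", "ball"}

print(sim(obj1, obj2))  # 3/5
print(sim(obj1, obj3))  # 1/7
print(sim(obj2, obj3))  # 1/7
```

obj1 and obj2 share three features (small, rubber, ball) out of five distinct ones. obj3 shares only "ball" with either of the others, out of seven distinct features.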

Conceptual Clustering (CC)
CC addresses the limitations of similarity-based clustering by using machine learning
techniques to produce general concept definitions and by applying background knowledge.
CLUSTER/2 is a good example of the CC approach.

CLUSTER/2
● CLUSTER/2 forms k categories by constructing individuals around k seed objects.
● CLUSTER/2 evaluates the resulting clusters, selecting new seeds and repeating the
process until its quality criteria are met. The algorithm is defined as follows:
○ Select k seeds from the set of observed objects (randomly
or by some selection function).
○ For each seed, using that seed as a positive instance and all other seeds as negative
instances, produce a maximally general definition that covers all of the positive and
none of the negative instances. (This may lead to multiple classifications of nonseed objects.)
○ Classify all objects in the sample according to those descriptions. Replace each
maximally general description with a maximally specific description that
covers all objects in the category. This decreases the likelihood that classes overlap
on unseen objects.

cont.
○ Classes may still overlap on given objects. CLUSTER/2 includes an algorithm for
adjusting overlapping definitions.
○ Using a distance metric, select the element closest to the center of each class (the
distance metric could be similar to the similarity metric above).
○ Using these central elements as new seeds, repeat the preceding steps until the desired
quality is reached.
○ If the clusters are unsatisfactory and no improvement occurs over several iterations,
select new seeds closest to the edge of each cluster, rather than those at the center.
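The seed-and-reseed loop can be illustrated with a deliberately simplified sketch. It is not CLUSTER/2 itself: it skips the maximally general and maximally specific descriptions and the overlap adjustment, and simply assigns each object to its most similar seed, then re-seeds from the most central member. The set-overlap similarity and the names `assign`, `central`, and `seed_loop` are assumptions made for illustration.

```python
def jaccard(a, b):
    """Shared features divided by distinct features (assumed similarity metric)."""
    return len(a & b) / len(a | b)

def assign(objects, seeds):
    """Put each object in the class of its most similar seed."""
    classes = [[] for _ in seeds]
    for obj in objects:
        best = max(range(len(seeds)), key=lambda i: jaccard(obj, seeds[i]))
        classes[best].append(obj)
    return classes

def central(members):
    """Pick the member with the highest total similarity to its classmates."""
    return max(members, key=lambda m: sum(jaccard(m, o) for o in members))

def seed_loop(objects, seeds, max_iters=10):
    """Reassign and re-seed until the seeds stop changing (a stand-in quality test)."""
    for _ in range(max_iters):
        classes = assign(objects, seeds)
        # Keep the old seed if a class ends up empty.
        new_seeds = [central(c) if c else s for c, s in zip(classes, seeds)]
        if new_seeds == seeds:
            break
        seeds = new_seeds
    return classes, seeds

obj1 = frozenset({"small", "red", "rubber", "ball"})
obj2 = frozenset({"small", "blue", "rubber", "ball"})
obj3 = frozenset({"large", "black", "wooden", "ball"})

classes, seeds = seed_loop([obj1, obj2, obj3], seeds=[obj1, obj3])
print(classes)
```

With obj1 and obj3 as the starting seeds, the two small rubber balls end up in one class and the large wooden ball in the other. A fuller version would replace the stopping test with a real quality measure and add the edge-seed fallback from the last step.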

COBWEB & Structure of Taxonomic Knowledge
● COBWEB is an incremental system for hierarchical conceptual clustering.
● COBWEB employs four basic operations in building the classification
tree.
○ Merging two nodes: replacing them with a node
whose children are the union of the original nodes' sets of children and which
summarizes the attribute-value distributions of all objects classified under
them.
○ Splitting a node: a node is split by replacing it with its children.
○ Inserting a new node: a node is created corresponding to the object being
inserted into the tree.
○ Passing an object down the hierarchy: effectively calling the COBWEB
algorithm on the object and the subtree rooted in the node.
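The structural operations (insert, merge, split) can be sketched on a small tree type. This is an illustrative data structure, not COBWEB's actual implementation: the `Node` fields and the helper names `leaf`, `merge`, and `split` are assumptions, and the evaluation step that decides when to apply each operation is left out.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class Node:
    count: int = 0                                          # objects classified under this node
    attr_counts: Counter = field(default_factory=Counter)   # (attribute, value) -> count
    children: list = field(default_factory=list)

def leaf(obj):
    """Inserting a new node: build a node for one object (attribute -> value dict)."""
    return Node(count=1, attr_counts=Counter(obj.items()))

def merge(parent, a, b):
    """Replace siblings a and b with one node holding the union of their children."""
    merged = Node(a.count + b.count, a.attr_counts + b.attr_counts,
                  a.children + b.children)
    parent.children = [c for c in parent.children
                       if c is not a and c is not b] + [merged]
    return merged

def split(parent, node):
    """Replace a node with its children in the parent's child list."""
    idx = next(i for i, c in enumerate(parent.children) if c is node)
    parent.children[idx:idx + 1] = node.children

# Two hypothetical categories under a root, described by a single attribute.
a = Node(2, Counter({("color", "red"): 2}),
         [leaf({"color": "red"}), leaf({"color": "red"})])
b = Node(1, Counter({("color", "blue"): 1}), [leaf({"color": "blue"})])
root = Node(3, a.attr_counts + b.attr_counts, [a, b])

m = merge(root, a, b)
print(m.count, dict(m.attr_counts), len(m.children))
split(root, m)
print(len(root.children))
```

The merged node summarizes the attribute-value counts of both originals, which is the information the evaluation step would need when scoring a candidate taxonomy.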

cont.
● COBWEB performs a hill-climbing search of the space of possible taxonomies.
● It initializes the taxonomy to a single category. For each subsequent instance, the algorithm
begins with the root category and moves through the tree. At each level it evaluates the
taxonomies resulting from
○ Placing the instance in the best existing category.
○ Adding a new category containing only the instance.
○ Merging two existing categories into one and adding the instance to that
category.
○ Splitting an existing category into two and placing the instance in the best
new resulting category.
