2. Introduction
A Framework for Symbol-Based Learning
Version Space Search
• The Candidate Elimination Algorithm
ID3 Decision Tree Induction Algorithm
Knowledge and Learning
• Explanation-Based Learning
Unsupervised Learning
• Conceptual clustering
Machine Learning 2
3. • Learning
• through the course of their interactions with the world
• through the experience of their own internal states and processes
• Is important for practical applications of AI
• Knowledge engineering bottleneck
• major obstacle to the widespread use of intelligent systems
• the cost and difficulty of building expert systems using traditional
knowledge acquisition techniques
• one solution
have the program begin with a minimal amount of knowledge
and learn from examples, high-level advice, and its own explorations of
the domain
4. • Definition of learning
• Views of Learning
• Generalization from experience
Induction: must generalize correctly to unseen instances of the domain
Inductive biases: selection criteria (the learner must select the most effective
aspects of its experience)
• Changes in the learner
acquisition of explicitly represented domain knowledge: based on
its experience, the learner constructs or modifies expressions in a
formal language (e.g. logic)
Any change in a system that allows it to perform better the second
time on repetition of the same task or on another task drawn from
the same population (Simon, 1983)
5. • Learning Algorithms vary in
• goals, available training data, learning strategies and knowledge
representation languages
• All algorithms learn by searching through a space of possible
concepts to find an acceptable generalization (concept space Fig. 9.5)
• Inductive learning
• learning a generalization from a set of examples
• concept learning is a typical inductive learning task
infer a definition from given examples of some concept (e.g. cat, soybean
disease)
allowing the learner to correctly recognize future instances of that concept
Two algorithms: version space search and ID3
6. • Similarity-based vs. Explanation-based
• Similarity-based (data-driven)
using no prior knowledge of the domain
rely on large numbers of examples
generalization on the basis of patterns in training data
• Explanation-based learning (prior-knowledge-driven)
using prior knowledge of the domain to guide generalization
learning by analogy and other techniques that utilize prior knowledge
to learn from a limited amount of training data
7. Supervised vs. Unsupervised
• supervised learning
learning from training instances of known classification
• unsupervised learning
learning from unclassified training data
conceptual clustering or category formation
8. • Learning Algorithms are characterized by a general
model
• Data and goals of the learning task
• Representation Language
• A set of operations
• Concept space
• Heuristic Search
• Acquired knowledge
10. • Data and Goals
• Type of data
positive or negative examples
a single positive example and domain-specific knowledge
high-level advice (e.g. condition of loop termination)
analogies (e.g. electricity vs. water)
• Goal of Learning algorithms: acquisition of
concept, general description of a class of objects
plans
problem-solving heuristics
other forms of procedural knowledge
• Properties and quality of data
come from the outside environment (e.g. teacher)
or generated by the program itself
reliable or contain noise
well-structured or unorganized
positive and negative or only positive
12. • Representation of learned knowledge
• concept expressions in predicate calculus
A simple formulation of the concept learning problem as conjunctive
sentences containing variables
• structured representation such as frames
• description of plans as a sequence of operations or triangle table
• representation of heuristics as problem-solving rules
size(obj1, small) ^ color(obj1, red) ^ shape(obj1, round)
size(obj2, large) ^ color(obj2, red) ^ shape(obj2, round)
=> size(X, Y) ^ color(X, red) ^ shape(X, round)
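The generalization step above can be sketched in code. This is a minimal, illustrative sketch (not from the source): each instance is an attribute dict, and values that differ between the two instances are replaced by the variable marker "?", yielding the least general description that covers both.

```python
def generalize(desc1, desc2):
    """Return the least general description covering both inputs:
    shared attribute values are kept, conflicting values become
    the variable '?' (as size does for obj1 and obj2)."""
    return {attr: (v1 if v1 == desc2[attr] else "?")
            for attr, v1 in desc1.items()}

# The two training instances from the slide:
obj1 = {"size": "small", "color": "red", "shape": "round"}
obj2 = {"size": "large", "color": "red", "shape": "round"}

print(generalize(obj1, obj2))
# size differs, so it generalizes to a variable; color and shape are kept:
# {'size': '?', 'color': 'red', 'shape': 'round'}
```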
13. • A Set of operations
• Given a set of training instances, the learner must construct a generalization,
heuristic rule, or plan that satisfies its goal
• Requires ability to manipulate representations
• Typical operations include
generalizing or specializing symbolic expressions
adjusting the weights in a neural network
modifying the program’s representations
• Concept space
• defines a space of potential concept definitions
• the complexity of the potential concept space is one measure of the difficulty of
the learning problem
14. • Heuristic Search
• Use available training data and heuristics to search efficiently
• Patrick Winston’s work on learning concepts from positive and negative
examples along with near misses (Fig. 2).
• The program learns by refining candidate description of the target concept
through generalization and specialization.
Generalization changes the candidate description to let it accommodate
new positive examples (Fig. 3)
Specialization changes the candidate description to exclude near misses
(Fig. 4)
The performance of the learning algorithm is highly sensitive to the quality
and order of the training examples
19. • Implementation of inductive learning as search through
a concept space
• Generalization operations impose an ordering on the
concepts in the space, and the learner uses this ordering to guide
the search
• 2.1 Generalization Operators and Concept Space
• 2.2 Candidate Elimination Algorithm
20. • Primary generalization operations used in ML
• Replacing constants with variables
color(ball, red) -> color(X, red)
• Dropping conditions from a conjunctive expression
shape(X, round) ^ size(X, small) ^ color(X, red)
-> shape(X, round) ^ color(X, red)
• Adding a disjunct to an expression
shape(X, round) ^ size(X, small) ^ color(X, red)
-> shape(X, round) ^ size(X, small) ^ (color(X, red) ∨ color(X, blue))
• Replacing a property with its parent in a class hierarchy
color(X, red)
-> color(X, primary_color) if primary_color is superclass of red
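Three of these operators can be sketched directly. The representation below (concepts as sets of (predicate, object, value) literals, a toy ISA hierarchy) is an assumption made for illustration, not part of the source:

```python
# Toy class hierarchy for the "climb hierarchy" operator (assumed):
ISA = {"red": "primary_color", "blue": "primary_color"}

concept = {("shape", "X", "round"), ("size", "X", "small"), ("color", "X", "red")}

def replace_constant(concept, const, var):
    """Generalize by replacing a constant with a variable,
    e.g. color(ball, red) -> color(X, red)."""
    return {(p, var if o == const else o, v) for (p, o, v) in concept}

def drop_condition(concept, pred):
    """Generalize by deleting one conjunct from the expression."""
    return {lit for lit in concept if lit[0] != pred}

def climb_hierarchy(concept, pred, isa=ISA):
    """Generalize by replacing a value with its parent class."""
    return {(p, o, isa.get(v, v)) if p == pred else (p, o, v)
            for (p, o, v) in concept}

print(replace_constant({("color", "ball", "red")}, "ball", "X"))
print(drop_condition(concept, "size"))
print(climb_hierarchy(concept, "color"))
```

The fourth operator (adding a disjunct) needs a richer representation than a single conjunction, which is why the slide's concept language stops at conjunctive expressions.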
21. • Notion of covering
• If concept P is more general than concept Q, we say that
“P covers Q”
• color(X,Y) covers color(ball,Y), which in turn covers color(ball,red)
• Concept space
• Defines a space of potential concept definitions
• The example concept space representing the
predicate obj(Sizes, Colors, Shapes) with properties and values
Sizes = {large, small}
Colors = {red, white, blue}
Shapes = {ball, brick, cube}
is presented in Fig. 5
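The covering relation on this obj(Sizes, Colors, Shapes) space can be sketched as follows. Representing a concept as a tuple with "?" as the variable marker is an illustrative assumption:

```python
def covers(general, specific):
    """True if `general` is at least as general as `specific`:
    each attribute either matches or is a variable '?'."""
    return all(g == s or g == "?" for g, s in zip(general, specific))

# obj(X, red, Y) covers obj(small, red, ball):
assert covers(("?", "red", "?"), ("small", "red", "ball"))
assert not covers(("small", "red", "?"), ("large", "red", "ball"))

# Size of this concept space: each attribute takes one of its
# values or the variable '?'.
sizes = {"large", "small"}
colors = {"red", "white", "blue"}
shapes = {"ball", "brick", "cube"}
num_concepts = (len(sizes) + 1) * (len(colors) + 1) * (len(shapes) + 1)
print(num_concepts)  # 3 * 4 * 4 = 48
```

This count is one way to see the slide's earlier point that the size of the concept space measures the difficulty of the learning problem: it grows multiplicatively with each attribute.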
23. • Version space: the set of all concept descriptions consistent
with the training examples.
• The version space shrinks as more
examples become available (Fig. 9.10)
• Specific to general search from positive examples
• General to specific search from negative examples
• Candidate elimination algorithm combines these into a bi-directional
search
• Generalize based on regularities found in the training data
• Supervised learning
24. • The learned concept must be general enough to cover all positive
examples, also must be specific enough to exclude all negative
examples
• maximally specific generalization
• Maximally general specialization
A concept c is maximally specific if it covers all positive examples,
none of the negative examples, and, for any concept c’ that covers
the positive examples, c ≤ c’
A concept c is maximally general if it covers none of the negative
training instances, and, for any other concept c’ that covers no
negative training instance, c ≥ c’.
30. Begin
Initialize G to the most general concept in the space;
Initialize S to the first positive training instance;
For each new positive instance p
Begin
Delete all members of G that fail to match p;
For every s in S, if s does not match p, replace s with its most specific generalizations that match p and are
more specific than some members of G;
Delete from S any hypothesis more general than some other hypothesis in S;
End;
For each new negative instance n
Begin
Delete all members of S that match n;
For each g in G that matches n, replace g with its most general specializations that do not match n and are
more general than some members of S;
Delete from G any hypothesis more specific than some other hypothesis in G;
End
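The pseudocode above can be turned into a runnable sketch for conjunctive concepts over fixed attributes ("?" meaning any value). The feature-vector representation and the tiny example data are assumptions for illustration; the G/S updates follow the slide's algorithm:

```python
def covers(g, x):
    """Concept g covers instance x."""
    return all(a == "?" or a == v for a, v in zip(g, x))

def more_general(g1, g2):
    """g1 is at least as general as g2 (g2 may contain '?')."""
    return all(a == "?" or a == b for a, b in zip(g1, g2))

def generalize_s(s, x):
    """Most specific generalization of s that also covers x."""
    return tuple(a if a == v else "?" for a, v in zip(s, x))

def specialize_g(g, x, domains):
    """Most general specializations of g that exclude x."""
    out = []
    for i, a in enumerate(g):
        if a == "?":
            for v in domains[i]:
                if v != x[i]:
                    out.append(g[:i] + (v,) + g[i + 1:])
    return out

def candidate_elimination(examples, domains):
    G = [("?",) * len(domains)]   # most general concept
    S = None                      # set from first positive instance
    for x, positive in examples:
        if positive:
            G = [g for g in G if covers(g, x)]          # prune G
            S = x if S is None else generalize_s(S, x)  # grow S
        else:
            if S is not None and covers(S, x):
                S = None  # S matched a negative: inconsistent data
            new_G = []
            for g in G:
                if covers(g, x):
                    new_G.extend(s for s in specialize_g(g, x, domains)
                                 if S is None or more_general(s, S))
                else:
                    new_G.append(g)
            # drop hypotheses more specific than another member of G
            G = [g for g in new_G
                 if not any(h != g and more_general(h, g) for h in new_G)]
    return S, G

domains = [("large", "small"), ("red", "white", "blue"),
           ("ball", "brick", "cube")]
examples = [
    (("small", "red", "ball"), True),
    (("small", "white", "ball"), False),
    (("large", "red", "ball"), True),
]
S, G = candidate_elimination(examples, domains)
print("S:", S)   # most specific consistent concept
print("G:", G)   # most general consistent concepts
```

For simplicity this sketch keeps S as a single hypothesis, which suffices for purely conjunctive concept languages; the slide's fully general algorithm maintains S as a set.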
31. • Combining the two directions of search into a single
algorithm has several benefits.
• The G and S sets summarize the information in the negative and positive
training instances.
• Fig. 9 gives an abstract description of the candidate
elimination algorithm.
• “+” signs represent positive instances
• “-” signs indicate negative instances
• The search “shrinks” the outermost concept to exclude negative instances
• The search “expands” the innermost concept to include new positive
instances
32. • The incremental nature of the learning algorithm
• It accepts training instances one at a time, forming a usable, although
possibly incomplete, generalization after each example (unlike batch
algorithms such as ID3).
• Even before the algorithm converges on a single concept, the G
and S sets provide usable constraints on that concept
• If c is the goal concept, then for all g∈G and s∈S, s≤c≤g.
• Any concept that is more general than some concept in G will cover some
negative instance; any concept that is more specific than some concept in S
will fail to cover some positive instances
33. • Problems
• combinatorics of the problem space: excessive growth of the search space
Useful to develop heuristics for pruning states from G and S (beam
search)
Using an inductive bias to reduce the size of the concept space
trades off expressiveness against efficiency
• The algorithm may fail to converge because of noise or inconsistency in
training data
One solution to this problem is to maintain multiple G and S sets
• Contribution
• explication of the relationship between knowledge representation,
generalization, and search in inductive learning