Machine Learning:
         Inductive Logic Programming
                         Dr Valentina Plekhanova
                        University of Sunderland, UK




                        Formalisms in Inductive Learning


    Learning the attribute descriptions, e.g. Decision Trees
    Learning the first-order relational descriptions, e.g. ILP








                                              ILP: a Framework


•   Theoretical Setting: Inductive Logic Programming
•   Task: Concept Learning
•   Methods: Inductive Learning
•   Algorithms: FOIL








Inductive Logic Programming


  Inductive Logic Programming: ILP = I ∩ LP, where I stands
  for induction in machine learning and LP stands for logic
  programming.








                          Inductive Concept Learning:
                                            Definition
  Given a set E of positive and negative examples of a concept C,
  find a hypothesis H, expressed in a given concept description
  language L, such that
     every positive example ε ∈ Ε⁺ is covered by H,
     no negative example ε ∈ Ε⁻ is covered by H
     (H is "complete and consistent").
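
  In symbols (a standard formalization; covers(H, ε), read "example ε is covered
  by H", is assumed notation that does not appear on the slide):

     \forall \varepsilon \in E^{+}:\ covers(H, \varepsilon)  \qquad \text{(completeness)}

     \forall \varepsilon \in E^{-}:\ \neg\, covers(H, \varepsilon)  \qquad \text{(consistency)}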








                         Inductive Logic Programming:
                                             a Method
    Reminder: Induction means reasoning from the specific
   to the general.
   In the case of inductive learning from examples, the
   learner is given some examples from which general
   rules, or a theory underlying the examples, are
   derived.
   Inductive inference can involve the use of
   background knowledge to construct a hypothesis
   that agrees with a given set of examples.





Definitions
 Clause - a component of a (complex) sentence, with its own
 subject and predicate.
       Predicate - the part of a statement which says something about
       the subject, e.g. "is short" in "Life is short".
       Subject - (contrasted with predicate) the word(s) in a sentence
       about which something is predicated (contrasted with
       object), e.g. "Life".
 Inference - the process of inferring - reaching a conclusion or
 forming an opinion from facts or reasoning.






                                                                         ILP
In an ILP problem the task is to define, from given examples, an
unknown relation (i.e. the target predicate) in terms of (itself
and) known relations from background knowledge.
In ILP, the training examples, the background knowledge and
the induced hypotheses are all expressed in a logic program
form, with additional restrictions imposed on each of the three
languages.
For example, training examples are typically represented as
ground facts of the target predicate, and most often background
knowledge is restricted to be of the same form.






                              First-Order Predicate Logic
- a formal framework for describing and reasoning about objects, their parts,
and relations among the objects and/or the parts. An important subset of first-
order logic is Horn clauses: grandparent (X,Y) ← parent (X,Z), parent (Z,Y)
 where
 grandparent (X,Y) - the head of the clause, or postcondition,
 parent (X,Z), parent (Z,Y) - the body of the clause, or precondition,
 grandparent, parent - predicates; a literal is any predicate or its negation,
 (X,Y), (X,Z), (Z,Y) - arguments,
 X, Y, Z - variables,
the comma between predicates means "conjunction"; ← means
                       IF (body of clause) THEN (head of clause)
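
As an illustration (a minimal sketch, not from the slides; the fact set and the
helper function below are assumed example data), the grandparent rule can be
evaluated over ground parent facts by finding a binding for the shared variable Z:

    # Ground facts for parent(X, Z), stored as (parent, child) pairs (assumed data).
    parent_facts = {("ann", "mary"), ("ann", "tom"), ("tom", "eve"), ("tom", "ian")}

    def grandparent(x, y):
        # grandparent(X,Y) <- parent(X,Z), parent(Z,Y)
        # The comma is a conjunction: both body literals must hold for some Z.
        return any((x, z) in parent_facts and (z, y) in parent_facts
                   for (_, z) in parent_facts)

    print(grandparent("ann", "eve"))   # True: ann -> tom -> eve
    print(grandparent("tom", "mary"))  # False: no such chain in the facts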





Reasons for ILP
 Any set of first-order Horn clauses can be interpreted as a
 program in the logic programming language PROLOG, so
 learning them (Horn clauses) is often called inductive logic
 programming (ILP).
 ILP is a convenient learning framework for two reasons:
   ILP is based on sets of IF-THEN rules (logic) - one of the
 most expressive and human-readable representations for learned
 hypotheses.
   ILP can be viewed as automatically inferring PROLOG
 programs from examples. PROLOG is a programming language
 in which programs are expressed as collections of Horn clauses.




                                  ILP Problem: an Example

   • Illustration of the ILP task on a simple problem of learning
     family relations.
   • Example – an ILP Problem
   • The task is to define the target relation daughter (x,y), which
     states that person x is a daughter of person y, in terms of the
     background knowledge relations female and parent.
   • These relations are given in the following table.
   • There are two positive ⊕ and two negative Θ examples
     of the target relation.







                                              A Simple ILP Problem:
                                      Learning the daughter Relation
  Training Examples              Background Knowledge       Background Knowledge

  daughter (mary, ann)   ⊕       parent (ann, mary)         female (ann)
  daughter (eve, tom)    ⊕       parent (ann, tom)          female (mary)
  daughter (tom, ann)    Θ       parent (tom, eve)          female (eve)
  daughter (eve, ann)    Θ       parent (tom, ian)

In the hypothesis language of Horn clauses it is possible to formulate the
following definition of the target relation:
                         daughter (x,y) ← female (x), parent (y,x)
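
As a quick check (a minimal sketch, not part of the slides; the Python encoding of
the table is an assumption), the induced clause can be tested against the training
examples to confirm it is complete and consistent:

    # Background knowledge from the table above, encoded as ground facts.
    parent = {("ann", "mary"), ("ann", "tom"), ("tom", "eve"), ("tom", "ian")}
    female = {"ann", "mary", "eve"}

    def covers(x, y):
        # Body of the induced clause: daughter(x,y) <- female(x), parent(y,x).
        return x in female and (y, x) in parent

    positives = [("mary", "ann"), ("eve", "tom")]
    negatives = [("tom", "ann"), ("eve", "ann")]

    print(all(covers(x, y) for x, y in positives))      # True  -> complete
    print(not any(covers(x, y) for x, y in negatives))  # True  -> consistent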




                                                              FOIL Algorithm
INPUT: B, E+, E-, H = ∅
    R = p(X1,…,Xn) ←
    WHILE E+ ≠ ∅ DO
          WHILE there are examples in E-, i.e. ∃ e ∈ E-, that are
          still covered by H ∪ {R}  DO
                find the best literal L (via FOIL_Gain) and add it to R
                R = R ← L   (i.e. append L to the body of R)
                E- = E- \ {e ∈ E- | e does not satisfy R}
          END
    H = H ∪ {R}
    E+ = E+ \ {e ∈ E+ | e is covered by H}
     (i.e. the examples in E+ that are covered by B together with H are removed)
    END
OUTPUT: H
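
The covering structure of the algorithm, written as a sketch (the helper functions
candidate_literals, covers and foil_gain are assumptions supplied by the caller;
this is a simplification, not Quinlan's full FOIL with variable bindings):

    def foil(pos, neg, candidate_literals, covers, foil_gain):
        hypothesis = []                               # H = empty set of rules
        pos = set(pos)
        while pos:                                    # WHILE E+ is not empty
            rule = []                                 # R = p(X1,...,Xn) <-  (empty body)
            covered_neg = set(neg)
            while covered_neg:                        # WHILE negatives are still covered
                # pick the candidate literal with the highest FOIL gain
                best = max(candidate_literals(rule),
                           key=lambda lit: foil_gain(lit, rule, pos, covered_neg))
                rule.append(best)                     # R = R <- L
                covered_neg = {e for e in covered_neg if covers(rule, e)}
            hypothesis.append(rule)                   # H = H u {R}
            pos = {e for e in pos if not covers(rule, e)}   # drop positives now covered
        return hypothesis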






                                                              FOIL Algorithm
   Consider some rule R, and a candidate literal L that might be
   added to the body of R. Let R′ be the new rule created by adding
   literal L to R.
   The value Foil_Gain (L, R) of adding L to R is defined as
      Foil_Gain (L, R) = t { log2 [p1 / (p1+n1)] − log2 [p0 / (p0+n0)] }
   where p0 is the number of positive bindings of rule R, n0 is the
   number of negative bindings of R, and p1 and n1 are the corresponding
   counts for rule R′.
   t is the number of positive bindings of rule R that are still
   covered after adding literal L to R. The Foil_Gain value has an
   interpretation in terms of information theory: −log2 [p0 / (p0+n0)] is
   the number of bits needed to encode the classification of a positive
   binding of R.
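
A direct transcription of this formula (a sketch; the binding counts are assumed to
have been computed by the caller, and the example call uses the counts p0=2, n0=2,
p1=2, n1=1 with an assumed t=2):

    from math import log2

    def foil_gain(p0, n0, p1, n1, t):
        # Information content of a positive binding before and after adding literal L.
        info_before = -log2(p0 / (p0 + n0))
        info_after = -log2(p1 / (p1 + n1))
        # t positive bindings remain covered; the gain is the information saved on them.
        return t * (info_before - info_after)

    print(round(foil_gain(p0=2, n0=2, p1=2, n1=1, t=2), 2))  # 0.83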




                                   FOIL Algorithm: an Example
  Rule1: daughter (X,Y) ←
  T1: (Mary, Ann)  +            p1 = 2              L1 = female (X)
      (Eve, Tom)   +            n1 = 2              Foil_Gain_L1 = …
      (Tom, Ann)   -            t1 = …              L2 = parent (Y,X)
      (Eve, Ann)   -                                Foil_Gain_L2 = …
  Rule2: daughter (X,Y) ← female (X)
  T2: (Mary, Ann)  +            p2 = 2              L2 = parent (Y,X)
      (Eve, Tom)   +            n2 = 1              Foil_Gain_L2 = …
      (Eve, Ann)   -            t2 = …
  Rule3: daughter (X,Y) ← female (X), parent (Y,X)
  T3: (Mary, Ann)  +            p3 = 2              Final rule - …?
      (Eve, Tom)   +            n3 = 0
                                t3 = …
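
  An illustrative computation from the counts above (not on the slide, which leaves
  the values as an exercise; t = 2 is assumed at each step, i.e. both positive
  bindings remain covered):

     \mathrm{Foil\_Gain}\big(female(X),\, Rule1\big) = 2\left(\log_2 \tfrac{2}{3} - \log_2 \tfrac{2}{4}\right) \approx 0.83

     \mathrm{Foil\_Gain}\big(parent(Y,X),\, Rule2\big) = 2\left(\log_2 \tfrac{2}{2} - \log_2 \tfrac{2}{3}\right) \approx 1.17

  After the second step Rule3 covers both positive examples and no negative example
  (n3 = 0), so daughter (X,Y) ← female (X), parent (Y,X) can serve as the final rule.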





Complete & Consistent

 Prior satisfiability: Before any hypothesis is taken into account,
 no negative example may be derivable from the background knowledge
 alone, i.e. the negative examples must not conflict with the
 background knowledge.
 Posterior satisfiability: No negative example can be derived from
 the Hypothesis together with the Background Knowledge.
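
 Written formally (a standard formulation of these conditions; B denotes the
 background knowledge and H the hypothesis as on the earlier slides, the rest of
 the notation is assumed):

    \forall e \in E^{-}:\; B \not\models e  \qquad \text{(prior satisfiability)}

    \forall e \in E^{-}:\; B \wedge H \not\models e  \qquad \text{(posterior satisfiability / consistency)}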








                             Complete & Consistent


    Prior Necessity: Some positive examples may already follow from
    the background knowledge alone, but not all of them may do so.
    Posterior Sufficiency: All positive examples must be covered by
    the background knowledge together with the hypothesis. If the
    hypothesis satisfies this condition we call it complete.
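
    Formally (the same standard formulation as above; the notation is assumed):

       \exists e \in E^{+}:\; B \not\models e  \qquad \text{(prior necessity)}

       \forall e \in E^{+}:\; B \wedge H \models e  \qquad \text{(posterior sufficiency / completeness)}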








                                                   ID3 vs FOIL
ID3:
 The learner learns attribute descriptions.
 There are limitations, e.g. a limited representational formalism.
 There is limited capability for taking the available background
 knowledge into account.
FOIL:
 Objects can be described structurally, i.e. in terms of their
 components and the relations among the components.
 The learner learns first-order relational descriptions.
 The given relations constitute the background knowledge.





