Deductive Databases
ADBMS
By:
Dabbal S. Mahara
2017
Deduction and Induction
 Deductive: towards the consequences
• All swans are white.
• Tessa is a swan.
 Tessa is white.
 Inductive: towards a generalisation of observations
• Joe and Lisa and Tex and Wili and ... (all observed swans)
are swans.
• Joe and Lisa and Tex and Wili and ... (all observed swans)
are white.
 All swans are white.
2
Deduction
 Human beings have the interesting ability to derive facts from a set
of data, even though these facts are not explicitly represented in the
data.
 That is, given appropriate information, humans can deduce new
information by applying rules.
 For example, given a list of people and their parents, we could
deduce their grandparents, great-grandparents, and so on, despite
the fact that these facts were not explicitly represented in the
original data.
 What we have done is to use a rule: “the parent of a parent is a
grandparent (and so on).”
3
Conventional Databases
 In very general terms, a conventional database consists of a
collection of facts.
 Using some form of query language, these facts may be accessed
and manipulated as required.
 However, facts must be explicitly stored in the database for them to
be of any use.
 If a fact is not explicitly represented in the database, then as far as
the database is concerned, that fact effectively does not exist.
4
Conventional Database
 To clarify this idea of implicit data representation, consider the following
example.
 Suppose we have a conventional database that contains the two facts:
• “John is Bill’s father.”
• “Bill is Derek’s father.”
 Now most people will quickly realise that John is also Derek’s grandfather
(assuming, of course, that the two Bills are the same person).
 In other words, this fact is implicit in the two existing facts. However, the
fact “John is Derek’s grandfather” is not stored in the database, so as far
as the database is concerned, there is no relationship whatsoever
between John and Derek.
5
Deductive Database
 A deductive database contains not only facts, but also general rules.
 These rules can be used to deduce new facts that are not explicitly
represented in the database; that is, data can be stored implicitly.
 Now consider the following deductive database:
“John is Bill’s father.”
“Bill is Derek’s father.”
“IF x is y’s father AND y is z’s father, THEN x is z’s grandfather.”
 In addition to the two facts, this database also contains a rule that specifies
the grandfather relationship between a pair of entities.
 The database can prove the statement “John is Derek’s grandfather” by
making the following substitution into the rule: x = “John”, y = “Bill” and z =
“Derek”.
6
Deductive Database
 The effect of rules of this form is to define new relations that are not
explicitly represented in the database.
 These are known as implicit or virtual relations. Explicit relations, like the
father relation above, are known as base relations.
 The set of virtual relations is called the intensional database (IDB), while
the set of base relations is called the extensional database (EDB).
 Deductive database systems are based on first-order logic. First-order
logic allows us to express both facts and rules about the facts using the
same syntax.
7
First Order Logic
 The first-order predicate calculus is one particular
language for expressing logical statements.
 For example, the representation of the database above in
first-order logic would be:
→ father(John, Bill)
→ father(Bill, Derek)
(∀x ∀y ∀z)(father(x, y) ∧ father(y, z) → grandfather(x, z))
where ∧, → and ∀ represent conjunction, implication and
the quantifier “for all” respectively.
8
Deductive Database
 A deductive database can be defined as an advanced database
augmented with an inference system.
 Deductive databases have grown out of the desire to combine logic
programming with relational databases to construct systems that
support a powerful formalism.
 The immediate advantage of this is that it can potentially reduce the
space needed to store data
Database + Inference
Deductive
database
9
Deductive Database
 In a deductive database system we typically specify rules through a
declarative language—a language in which we specify what to
achieve rather than how to achieve it.
 An inference engine (or deduction mechanism) within the system
can deduce new facts from the database by interpreting these
rules.
 Deductive databases have not found widespread adoptions outside
academia, but some of their concepts are used in todays relational
databases to support the advanced features of more recent SQL
standards.
10
Why Deductive DBs?
 This has the advantage that some data can be stored implicitly
using rules, rather than explicitly. This reduces the amount of
storage the database occupies.
 The use of rules also allows us to store new kinds of data, such
as recursive data.
 It incorporates the features and power of a logic programming
language (e.g. Prolog).
11
Deductive Database
 Expert systems work in a similar way to deductive
databases, using rules to deduce information.
 A deductive database is oriented towards storing and
manipulating large amounts of data, whereas an expert
system stores and manipulates large numbers of rules.
 These rules are usually expressed in the database query
language Datalog because of its simplicity and readability .
12
 The deductive database work based on logic has used Prolog as a
starting point. A variation of Prolog called Datalog is used to define rules
declaratively in conjunction with an existing set of relations, which are
themselves treated as literals in the language.
 A deductive database uses two main types of specifications: facts and
rules.
 Facts can be considered as the data stored as relations in a relational
database.Facts are specified in a manner similar to the way relations are
specified, except that it is not necessary to include the attribute names.
 Rules are somewhat similar to relational views. They specify virtual
relations that are not actually stored but that can be formed from the facts
by applying inference mechanisms based on the rule specifications.
13Specification of DDB
14
Prolog/Datalog Notation
 The notation used in Prolog/Datalog is based on providing
predicates with unique names
 A predicate has an implicit meaning, which is suggested by
the predicate name, and a fixed number of arguments
 If the arguments are all constant values, the predicate simply
states that a certain fact is true
 If the predicate has variables as arguments, it is either
considered as a query or as part of a rule or constraint
 All constant values in a predicate are either numeric or
character strings starting with lowercase letters, whereas
variable names always start with an uppercase letter
Datalog Notation
 A program is built from basic objects called atomic formulas
 Atomic formulas are literals of the form p(a1, a2, …, an), where p
is a predicate name and n is the number of arguments for
predicate p
 Different predicate symbols can have different number of
arguments, and the number of arguments n of predicate p is
sometimes called the arity or degree of p
 The arguments can be either constant values or variable
names; Constant values are either numeric or character strings
starting with lowercase letters, whereas variable names always
start with an uppercase letter
16
Prolog/Datalog Notation
Fig: (a) Prolog notation (b) The supervisory tree
Basic inference mechanism for logic
programs
17
• Interpretation of programs (rules + facts)
• There are two main alternatives for interpreting the theoretical
meaning of rules: proof theoretic, and model theoretic interpretation
• proof theoretic interpretation: In the proof-theoretic interpretation of
rules, we consider the facts and rules to be true statements, or
axioms. Ground axioms contain no variables.
1. The facts are ground axioms that are considered to be true
statements.
2. Rules are deductive axioms since they are used to construct
proofs that derive new facts from existing facts.
• The example given next shows how to prove the fact
SUPERIOR(james, ahmad)
18
Example:
1. superior(X, Y)  supervise(X, Y). (rule 1)
2. superior(X, Y)  supervise(X, Z), superior (Z, Y). (rule 2)
3. supervise(jennifer, ahmad). (ground axiom, given)
4. supervise(james, jennifer). (ground axiom, given)
5. superior(jennifer, ahmad). (apply rule 1 on 3)
6. superior(james, ahmad). (apply rule 2 on 4 and 5)
 Given a finite or an infinite domain of constant values, assign to each
predicate in the program every possible combination of values as arguments.
 We must then determine whether the predicate is true or false.
 In general, it is sufficient to specify the combinations of arguments that make
the predicate true, and to state that all other combinations make the
predicate false.
 If this is done for every predicate, it is called an interpretation of the set of
predicates.
 For example, consider the interpretation shown in Figure given next for the
predicates SUPERVISE and SUPERIOR.
 This interpretation assigns a truth value (true or false) to every possible
combination of argument values (from a finite domain) for the two predicates.
19Model theoretic interpretation
Model theoretic interpretation 20
• An interpretation is called a model for a specific set of rules if
those rules are always true under that interpretation;
• that is, for any values assigned to the variables in the rules, the
head of the rules is true when we substitute the truth values
assigned to the predicates in the body of the rule by that
interpretation.
21
Example:
1. superior(X, Y)  supervise(X, Y). (rule 1)
2. superior(X, Y)  supervise(X, Z), superior(Z, Y). (rule 2)
known facts:
supervise(franklin, john), supervise(franklin, ramesh),
supervise(franklin, joyce), supervise(james, franklin),
supervise(jennifer, alicia), supervise(jennifer, ahmad),
supervise(james, jennifer).
For all other possible (X, Y) combinations supervise(X, Y) is false.
domain = {james, franklin, john, ramesh, joyce, jennifer, alicia, ahmad}
22
• The above interpretation is also a model for the rules (1) and (2) since each
of them evaluates always to true under the interpretation.
• For example,
superior(X, Y)  supervise(X, Y)
superior(franklin, john)  supervise(franklin, john) is true.
superior(franklin, ramesh)  supervise(franklin, ramesh) is true.
... …
superior(X, Y)  supervise(X, Z), superior(Z, Y)
superior(james, ramesh)  supervise(james, franklin),
superior (franklin, ramesh) is true.
superior(james, alicia)  supervise(james, jennifer),
superior (jennifer, alicia) is true.
Inference mechanism 23
• In general, there are two approaches to evaluating logical
programs: bottom-up and top-down.
• Bottom-up mechanism (also called forward chaining and
bottom-up resolution)
1. The inference engine starts with the facts and applies the
rules to generate new facts. That is, the inference moves
forward from the facts toward the goal.
2. As facts are generated, they are checked against the
query predicate goal for a match.
24
- Example
query goal: superior(james, Y)?
rules and facts are given as above.
1. Check whether any of the existing facts directly matches the query.
2. Apply the first rule to the existing facts to generate new facts.
3. Apply the second rule to the existing facts to generate new facts.
4. As each fact is generated, it is checked for a match of the the query
goal.
5. Repeat step 1 - 4 until no more new facts can be found.
All the facts of the form: superior(james, a) are the answers.
25
Example:
1. superior(X, Y)  supervise(X, Y). (rule 1)
2. superior(X, Y)  supervise(X, Z), superior(Z, Y). (rule 2)
known facts:
supervise(franklin, john), supervise(franklin, ramesh),
supervise(franklin, joyce), supervise(james, franklin),
supervise(jennifer, alicia), supervise(jennifer, ahmad),
supervise(james, jennifer).
For all other possible (X, Y) combinations supervise(X, Y) is false.
domain = {james, franklin, john, ramesh, joyce, jennifer, alicia, ahmad}
superior(james, Y)?
applying the first rule: superior(james, franklin), superior(james, jennifer)
Y = {franklin, jennifer}
applying the second rule: Y = {John, Joyce, Ramesh, alicia, ahmad}
Top-down mechanism 26
(also called back chaining and top-down resolution)
1. The inference engine starts with the query goal and
attempts to find matches to the variables that lead to
valid facts in the database. That is, the inference moves
backward from the intended goal to determine facts that
would satisfy the goal.
2. During the course, the rules are used to generate
subgoals. The matching of these subgoals will lead to the
match of the intended goal.
27
-Example
query goal: ?-superior(james, Y)
rules and facts are given as above.
Query: ?- superior(james, Y)
Rule1: superior(james, Y) 
supervise(james, Y)
Rule2: superior(james, Y) 
supervise(james, Z),
superior(Z, Y)
supervise(james, Z)
superior(franklin, Y) superior(jennifer, Y)
Y=franklin, jennifer
Z=frankiln Z=jennifer
Rule1: superior(franklin, Y) 
supervise(franklin, Y)
Rule1: superior(jennifer, Y) 
supervise(jennifer, Y)
Y= john, ramesh, joyce Y= alicia, ahmad
28
29
Thank You !
30

Deductive databases

  • 1.
  • 2.
    Deduction and Induction Deductive: towards the consequences • All swans are white. • Tessa is a swan.  Tessa is white.  Inductive: towards a generalisation of observations • Joe and Lisa and Tex and Wili and ... (all observed swans) are swans. • Joe and Lisa and Tex and Wili and ... (all observed swans) are white.  All swans are white. 2
  • 3.
    Deduction  Human beingshave the interesting ability to derive facts from a set of data, even though these facts are not explicitly represented in the data.  That is, given appropriate information, humans can deduce new information by applying rules.  For example, given a list of people and their parents, we could deduce their grandparents, great-grandparents, and so on, despite the fact that these facts were not explicitly represented in the original data.  What we have done is to use a rule: “the parent of a parent is a grandparent (and so on).” 3
  • 4.
    Conventional Databases  Invery general terms, a conventional database consists of a collection of facts.  Using some form of query language, these facts may be accessed and manipulated as required.  However, facts must be explicitly stored in the database for them to be of any use.  If a fact is not explicitly represented in the database, then as far as the database is concerned, that fact effectively does not exist. 4
  • 5.
    Conventional Database  Toclarify this idea of implicit data representation, consider the following example.  Suppose we have a conventional database that contains the two facts: • “John is Bill’s father.” • “Bill is Derek’s father.”  Now most people will quickly realise that John is also Derek’s grandfather (assuming, of course, that the two Bills are the same person).  In other words, this fact is implicit in the two existing facts. However, the fact “John is Derek’s grandfather” is not stored in the database, so as far as the database is concerned, there is no relationship whatsoever between John and Derek. 5
  • 6.
    Deductive Database  Adeductive database contains not only facts, but also general rules.  These rules can be used to deduce new facts that are not explicitly represented in the database; that is, data can be stored implicitly.  Now consider the following deductive database: “John is Bill’s father.” “Bill is Derek’s father.” “IF x is y’s father AND y is z’s father, THEN x is z’s grandfather.”  In addition to the two facts, this database also contains a rule that specifies the grandfather relationship between a pair of entities.  The database can prove the statement “John is Derek’s grandfather” by making the following substitution into the rule: x = “John”, y = “Bill” and z = “Derek”. 6
  • 7.
    Deductive Database  Theeffect of rules of this form is to define new relations that are not explicitly represented in the database.  These are known as implicit or virtual relations. Explicit relations, like the father relation above, are known as base relations.  The set of virtual relations is called the intensional database (IDB), while the set of base relations is called the extensional database (EDB).  Deductive database systems are based on first-order logic. First-order logic allows us to express both facts and rules about the facts using the same syntax. 7
  • 8.
    First Order Logic The first-order predicate calculus is one particular language for expressing logical statements.  For example, the representation of the database above in first-order logic would be: → father(John, Bill) → father(Bill, Derek) (∀x ∀y ∀z)(father(x, y) ∧ father(y, z) → grandfather(x, z)) where ∧, → and ∀ represent conjunction, implication and the quantifier “for all” respectively. 8
  • 9.
    Deductive Database  Adeductive database can be defined as an advanced database augmented with an inference system.  Deductive databases have grown out of the desire to combine logic programming with relational databases to construct systems that support a powerful formalism.  The immediate advantage of this is that it can potentially reduce the space needed to store data Database + Inference Deductive database 9
  • 10.
    Deductive Database  Ina deductive database system we typically specify rules through a declarative language—a language in which we specify what to achieve rather than how to achieve it.  An inference engine (or deduction mechanism) within the system can deduce new facts from the database by interpreting these rules.  Deductive databases have not found widespread adoptions outside academia, but some of their concepts are used in todays relational databases to support the advanced features of more recent SQL standards. 10
  • 11.
    Why Deductive DBs? This has the advantage that some data can be stored implicitly using rules, rather than explicitly. This reduces the amount of storage the database occupies.  The use of rules also allows us to store new kinds of data, such as recursive data.  It incorporates the features and power of a logic programming language (e.g. Prolog). 11
  • 12.
    Deductive Database  Expertsystems work in a similar way to deductive databases, using rules to deduce information.  A deductive database is oriented towards storing and manipulating large amounts of data, whereas an expert system stores and manipulates large numbers of rules.  These rules are usually expressed in the database query language Datalog because of its simplicity and readability . 12
  • 13.
     The deductivedatabase work based on logic has used Prolog as a starting point. A variation of Prolog called Datalog is used to define rules declaratively in conjunction with an existing set of relations, which are themselves treated as literals in the language.  A deductive database uses two main types of specifications: facts and rules.  Facts can be considered as the data stored as relations in a relational database.Facts are specified in a manner similar to the way relations are specified, except that it is not necessary to include the attribute names.  Rules are somewhat similar to relational views. They specify virtual relations that are not actually stored but that can be formed from the facts by applying inference mechanisms based on the rule specifications. 13Specification of DDB
  • 14.
    14 Prolog/Datalog Notation  Thenotation used in Prolog/Datalog is based on providing predicates with unique names  A predicate has an implicit meaning, which is suggested by the predicate name, and a fixed number of arguments  If the arguments are all constant values, the predicate simply states that a certain fact is true  If the predicate has variables as arguments, it is either considered as a query or as part of a rule or constraint  All constant values in a predicate are either numeric or character strings starting with lowercase letters, whereas variable names always start with an uppercase letter
  • 15.
    Datalog Notation  Aprogram is built from basic objects called atomic formulas  Atomic formulas are literals of the form p(a1, a2, …, an), where p is a predicate name and n is the number of arguments for predicate p  Different predicate symbols can have different number of arguments, and the number of arguments n of predicate p is sometimes called the arity or degree of p  The arguments can be either constant values or variable names; Constant values are either numeric or character strings starting with lowercase letters, whereas variable names always start with an uppercase letter
  • 16.
    16 Prolog/Datalog Notation Fig: (a)Prolog notation (b) The supervisory tree
  • 17.
    Basic inference mechanismfor logic programs 17 • Interpretation of programs (rules + facts) • There are two main alternatives for interpreting the theoretical meaning of rules: proof theoretic, and model theoretic interpretation • proof theoretic interpretation: In the proof-theoretic interpretation of rules, we consider the facts and rules to be true statements, or axioms. Ground axioms contain no variables. 1. The facts are ground axioms that are considered to be true statements. 2. Rules are deductive axioms since they are used to construct proofs that derive new facts from existing facts. • The example given next shows how to prove the fact SUPERIOR(james, ahmad)
  • 18.
    18 Example: 1. superior(X, Y) supervise(X, Y). (rule 1) 2. superior(X, Y)  supervise(X, Z), superior (Z, Y). (rule 2) 3. supervise(jennifer, ahmad). (ground axiom, given) 4. supervise(james, jennifer). (ground axiom, given) 5. superior(jennifer, ahmad). (apply rule 1 on 3) 6. superior(james, ahmad). (apply rule 2 on 4 and 5)
  • 19.
     Given afinite or an infinite domain of constant values, assign to each predicate in the program every possible combination of values as arguments.  We must then determine whether the predicate is true or false.  In general, it is sufficient to specify the combinations of arguments that make the predicate true, and to state that all other combinations make the predicate false.  If this is done for every predicate, it is called an interpretation of the set of predicates.  For example, consider the interpretation shown in Figure given next for the predicates SUPERVISE and SUPERIOR.  This interpretation assigns a truth value (true or false) to every possible combination of argument values (from a finite domain) for the two predicates. 19Model theoretic interpretation
  • 20.
    Model theoretic interpretation20 • An interpretation is called a model for a specific set of rules if those rules are always true under that interpretation; • that is, for any values assigned to the variables in the rules, the head of the rules is true when we substitute the truth values assigned to the predicates in the body of the rule by that interpretation.
  • 21.
    21 Example: 1. superior(X, Y) supervise(X, Y). (rule 1) 2. superior(X, Y)  supervise(X, Z), superior(Z, Y). (rule 2) known facts: supervise(franklin, john), supervise(franklin, ramesh), supervise(franklin, joyce), supervise(james, franklin), supervise(jennifer, alicia), supervise(jennifer, ahmad), supervise(james, jennifer). For all other possible (X, Y) combinations supervise(X, Y) is false. domain = {james, franklin, john, ramesh, joyce, jennifer, alicia, ahmad}
  • 22.
    22 • The aboveinterpretation is also a model for the rules (1) and (2) since each of them evaluates always to true under the interpretation. • For example, superior(X, Y)  supervise(X, Y) superior(franklin, john)  supervise(franklin, john) is true. superior(franklin, ramesh)  supervise(franklin, ramesh) is true. ... … superior(X, Y)  supervise(X, Z), superior(Z, Y) superior(james, ramesh)  supervise(james, franklin), superior (franklin, ramesh) is true. superior(james, alicia)  supervise(james, jennifer), superior (jennifer, alicia) is true.
  • 23.
    Inference mechanism 23 •In general, there are two approaches to evaluating logical programs: bottom-up and top-down. • Bottom-up mechanism (also called forward chaining and bottom-up resolution) 1. The inference engine starts with the facts and applies the rules to generate new facts. That is, the inference moves forward from the facts toward the goal. 2. As facts are generated, they are checked against the query predicate goal for a match.
  • 24.
    24 - Example query goal:superior(james, Y)? rules and facts are given as above. 1. Check whether any of the existing facts directly matches the query. 2. Apply the first rule to the existing facts to generate new facts. 3. Apply the second rule to the existing facts to generate new facts. 4. As each fact is generated, it is checked for a match of the the query goal. 5. Repeat step 1 - 4 until no more new facts can be found. All the facts of the form: superior(james, a) are the answers.
  • 25.
    25 Example: 1. superior(X, Y) supervise(X, Y). (rule 1) 2. superior(X, Y)  supervise(X, Z), superior(Z, Y). (rule 2) known facts: supervise(franklin, john), supervise(franklin, ramesh), supervise(franklin, joyce), supervise(james, franklin), supervise(jennifer, alicia), supervise(jennifer, ahmad), supervise(james, jennifer). For all other possible (X, Y) combinations supervise(X, Y) is false. domain = {james, franklin, john, ramesh, joyce, jennifer, alicia, ahmad} superior(james, Y)? applying the first rule: superior(james, franklin), superior(james, jennifer) Y = {franklin, jennifer} applying the second rule: Y = {John, Joyce, Ramesh, alicia, ahmad}
  • 26.
    Top-down mechanism 26 (alsocalled back chaining and top-down resolution) 1. The inference engine starts with the query goal and attempts to find matches to the variables that lead to valid facts in the database. That is, the inference moves backward from the intended goal to determine facts that would satisfy the goal. 2. During the course, the rules are used to generate subgoals. The matching of these subgoals will lead to the match of the intended goal.
  • 27.
    27 -Example query goal: ?-superior(james,Y) rules and facts are given as above. Query: ?- superior(james, Y) Rule1: superior(james, Y)  supervise(james, Y) Rule2: superior(james, Y)  supervise(james, Z), superior(Z, Y) supervise(james, Z) superior(franklin, Y) superior(jennifer, Y) Y=franklin, jennifer Z=frankiln Z=jennifer Rule1: superior(franklin, Y)  supervise(franklin, Y) Rule1: superior(jennifer, Y)  supervise(jennifer, Y) Y= john, ramesh, joyce Y= alicia, ahmad
  • 28.
  • 29.
  • 30.