Knowledge representation in artificial intelligence (AI) is a fundamental concept that involves the process of structuring and encoding knowledge so that AI systems can understand, reason, and make decisions. Effective knowledge representation is essential for AI systems to model and work with complex real-world information. Here are some key aspects of knowledge representation in AI:
Symbolic Knowledge Representation: This approach uses symbols and rules to represent knowledge. It involves encoding information using symbols, predicates, and logical statements. Common formalisms include first-order logic and propositional logic. Symbolic representation is particularly suited for knowledge-based systems and expert systems.
Semantic Networks: In a semantic network, knowledge is represented using nodes and links to denote relationships between concepts. This form of representation is intuitive and is often used for organizing knowledge in a structured manner.
Frames and Ontologies: Frames and ontologies are used to represent knowledge by structuring information into frames or classes. Frames contain attributes and values, and they help in organizing and categorizing knowledge. Ontologies, such as OWL (Web Ontology Language), provide a more formal representation of knowledge for use in the semantic web and knowledge graphs.
Rule-Based Systems: Rule-based systems use a set of rules to represent and reason with knowledge. These rules can be encoded in the form of "if-then" statements, allowing AI systems to make decisions and draw inferences.
Fuzzy Logic: Fuzzy logic allows for the representation of uncertainty and vagueness in knowledge. It is particularly useful in situations where information is not black and white but falls within degrees of truth.
Bayesian Networks: Bayesian networks represent knowledge using probability distributions and conditional dependencies. They are valuable for modeling uncertain or probabilistic relationships in various domains, such as medical diagnosis and risk analysis.
Connectionist Models: Connectionist models, like neural networks, use distributed representations to encode knowledge. In these models, knowledge is spread across interconnected nodes (neurons), and learning occurs through the adjustment of connection weights. These networks are particularly effective in tasks such as pattern recognition and natural language processing.
Hybrid Approaches: Many AI systems use a combination of different knowledge representation techniques to address the complexities of real-world problems. For instance, combining symbolic representation with connectionist models is a common approach in modern AI.
The choice of knowledge representation method depends on the specific problem domain, the nature of the data, and the requirements of the AI system.
1. Foundations of Knowledge
Representation in Artificial Intelligence
KNOWLEDGE REPRESENTATION
1. First Order Predicate Logic
2. Knowledge engineering in first order logic
3. Inference in First order logic
4. Prolog Programming
5. Unification and lifting
Dr.J.SENTHILKUMAR
Assistant Professor
Department of Computer Science and Engineering
KIT-KALAIGNARKARUNANIDHI INSTITUTE OF TECHNOLOGY
2. Knowledge Representation
• Knowledge Representation and Reasoning represents information from the real world
for a computer to understand, and then utilizes this knowledge to solve complex real-
life problems such as communicating with human beings in natural language.
• Types of Knowledge:
• Declarative Knowledge
• Structural Knowledge
• Procedural Knowledge
• Meta Knowledge
• Heuristic Knowledge
4. Techniques
• There are four techniques in Knowledge Representation
• Logical Representation
• Semantic Network Representation
• Frame Representation
• Production Rule
5. • Logical Representation
• It is a language with some definite rules which deals with
propositions and has no ambiguity in representation.
• It represents a conclusion based on various conditions and lays
down some important communication rules.
• It consists of two precisely defined parts:
• Syntax and Semantics
6. • Logical Representation
• Because it supports sound inference, each sentence can be
translated into logic using its syntax and semantics.
• Syntax: the well-formed sentences of the language.
• Semantics: the truth or meaning of sentences in the world.
• There are two types of Logical Representation:
• Propositional Logic
• First Order Logic
7. •The primary difference between propositional and
first-order logic lies in the ontological
commitment made by each language—that is,
what it assumes about the nature of reality.
•Mathematically, this commitment is expressed
through the nature of the formal models with
respect to which the truth of sentences is defined.
8. • For example, propositional logic assumes that there are facts
that either hold or do not hold in the world. Each fact can be in
one of two states: true or false, and each model assigns true or
false to each proposition symbol.
• First-order logic assumes more; namely, that the world consists
of objects with certain relations among them that do or do not
hold. The formal models are correspondingly more complicated
than those for propositional logic.
9. •Special-purpose logics make still further ontological
commitments; for example, temporal logic assumes that facts
hold at particular times and that those times (which may be
points or intervals) are ordered.
•Thus, special-purpose logics give certain kinds of objects (and
the axioms about them) “first class” status within the logic,
rather than simply defining them within the knowledge base.
10. •Higher-order logic views the relations and functions referred
to by first-order logic as objects in themselves. This allows one
to make assertions about all relations—for example, one could
wish to define what it means for a relation to be transitive.
•Unlike most special-purpose logics, higher-order logic is
strictly more expressive than first-order logic, in the sense that
some sentences of higher-order logic cannot be expressed by
any finite number of first-order logic sentences.
11. • A logic can also be characterized by its epistemological commitments—the
possible states of knowledge that it allows with respect to each fact.
• In both propositional and first-order logic, a sentence represents a fact and
the agent either believes the sentence to be true, believes it to be false, or
has no opinion.
• Systems using probability theory, on the other hand, can have any degree
of belief, ranging from 0 (total disbelief) to 1 (total belief). For example, a
probabilistic wumpus-world agent might believe that the wumpus is in [1,3]
with probability 0.75. The ontological and epistemological commitments of
five different logics can be compared along these two dimensions.
12. Propositional Logic
• Propositional Logic is the simplest logic.
• A proposition is a declarative statement that is either true or false.
• Propositional Logic cannot express predicates; it can only say whether a whole statement is true or false.
• Logical Connectives
14. A – it is hot
B – it is humid
C – it is raining
Conditions: 1. If it is humid, then it is hot => B → A
2. If it is hot and humid, then it is not raining => A ∧ B → ¬C
So a proposition is a statement of a fact.
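The two conditions can be checked mechanically by enumerating all truth assignments for A, B, and C; a minimal sketch (the helper name is ours):

```python
from itertools import product

def implies(p, q):
    # Material implication: p -> q is false only when p is true and q is false.
    return (not p) or q

# Enumerate all truth assignments for A (hot), B (humid), C (raining)
# and collect the models that satisfy both conditions from the example.
models = []
for A, B, C in product([True, False], repeat=3):
    cond1 = implies(B, A)            # If it is humid, then it is hot.
    cond2 = implies(A and B, not C)  # If it is hot and humid, it is not raining.
    if cond1 and cond2:
        models.append((A, B, C))

print(len(models))  # 5 of the 8 assignments satisfy both conditions
```

Each model assigns true or false to every proposition symbol, exactly as slide 8 describes.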
Limitations of propositional logic:
• We cannot represent relations like all, some, or none with propositional logic.
• All the girls are intelligent.
• Some apples are sweet.
• Propositional Logic has limited expressive power.
• In PL we cannot describe statements in terms of their properties or logical relationships.
15. • First Order Logic :
• FOL is another way of knowledge representation in A.I.
• It is an extension to PL.
• FOL is also known as predicate logic. It is a powerful language that expresses
information about the objects in a more natural way and can also express the
relationships between those objects.
• FOL does not only assume that the world contains facts, as PL does, but also
assumes:
• Objects
• Relations – these can be unary relations, or n-ary relations such as sister of,
brother of, has color
• Functions – father of, best friend, end of ...
• Like a natural language, FOL has two main parts:
• Syntax
• Semantics
18. Symbols and interpretations:
• The basic syntactic elements of first-order logic are the
symbols that stand for objects, relations, and functions. The
symbols, therefore, come in three kinds: constant symbols,
which stand for objects; predicate symbols, which stand for
relations; and function symbols, which stand for functions.
• An interpretation specifies exactly which objects, relations,
and functions are referred to by the constant, predicate, and
function symbols.
19. Symbols and interpretations:
• A term is a logical expression that refers to an object.
Constant symbols are therefore terms, but it is not always
convenient to have a distinct symbol to name every object.
• An atomic sentence (or atom for short) is formed from a
predicate symbol optionally followed by a parenthesized list of
terms,
• Richard the Lionheart is the brother of King John
Brother (Richard , John).
20. •Complex sentences
•We can use logical connectives to construct more
complex sentences, with the same syntax and
semantics as in propositional calculus.
•¬Brother (LeftLeg(Richard), John)
•Brother (Richard , John) ∧ Brother (John,Richard)
•King(Richard ) ∨ King(John)
•¬King(Richard) ⇒ King(John) .
21. • “One plus two equals three.”
Objects: one, two, three, one plus two; Relation: equals;
Function: plus. (“One plus two” is a name for the object that is
obtained by applying the function “plus” to the objects “one” and
“two.” “Three” is another name for this object.)
• “Squares neighboring the wumpus are smelly.”
Objects: wumpus, squares; Property: smelly; Relation:
neighboring.
• “Evil King John ruled England in 1200.”
Objects: John, England, 1200; Relation: ruled; Properties:
evil, king.
22. 1) Marcus is a man
Man(Marcus)
2) Marcus was a Pompeian
Pompeian(Marcus)
3) All Pompeians were Romans
∀x: Pompeian(x) ⇒ Roman(x)
4) Every gardener likes the sun
∀x: Gardener(x) ⇒ Likes(x, Sun)
5) All purple mushrooms are poisonous
∀x: Mushroom(x) ∧ Purple(x) ⇒ Poisonous(x)
6) Everyone is loyal to someone
∀x ∃y: Loyal(x, y)
7) Everyone loves everyone
∀x ∀y: Loves(x, y)
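Quantified sentences like these can be checked over a finite domain by exhaustive enumeration, since ∀ behaves like `all` and ∃ like `any`; a sketch with an assumed toy domain:

```python
# Checking two of the quantified sentences above over a tiny finite domain
# (the domain and relations are hypothetical illustrations).
domain = {"Marcus", "Brutus", "Livia"}
pompeian = {"Marcus"}
roman = {"Marcus", "Brutus", "Livia"}
loves = {(x, y) for x in domain for y in domain}  # everyone loves everyone

# ∀x: Pompeian(x) ⇒ Roman(x): an implication only fails when the premise
# holds and the conclusion does not.
all_pompeians_roman = all((x not in pompeian) or (x in roman) for x in domain)

# ∀x ∃y: Loves(x, y)
everyone_loves_someone = all(any((x, y) in loves for y in domain) for x in domain)

print(all_pompeians_roman, everyone_loves_someone)  # True True
```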
24. KNOWLEDGE ENGINEERING IN FIRST-ORDER LOGIC
• A knowledge engineer is someone who investigates a particular domain, learns what concepts
are important in that domain, and creates a formal representation of the objects and relations in the
domain.
• The knowledge-engineering process :
• Knowledge engineering projects vary widely in content, scope, and difficulty, but all such projects
include the following steps:
1. Identify the task
2. Assemble the relevant knowledge.
3. Decide on a vocabulary of predicates, functions, and constants.
4. Encode general knowledge about the domain.
5. Encode a description of the specific problem instance.
6. Pose queries to the inference procedure and get answers.
7. Debug the knowledge base.
25. Identify the task
- Goal formulation step.
Assemble the relevant knowledge.
- Collect common-sense knowledge of the model.
Decide on a vocabulary of predicates, functions, and constants.
- Decide what knowledge is to be encoded for the given problem.
Encode general knowledge about the domain.
- Encode rules in some knowledge representation language.
Encode a description of the specific problem instance.
- Contains all the facts in the given problem.
Steps 6 & 7:
- Query the knowledge base to get the results.
- If the answers aren't accurate, debug the knowledge base and repeat steps 6 & 7.
26. Inference in First Order Logic:
• Inference in first order logic is used to deduce new facts or sentences from existing sentences.
• Before understanding first order logic inference rules, let's understand some basic terminology
used in first order logic.
• Substitution
It is a fundamental operation performed on terms and formulas. It occurs in all inference systems in first
order logic. F[a/x] denotes the result of substituting the term a for the variable x in F.
• Equality:
FOL does not only use predicates and terms for making atomic sentences; it also provides another
construct, equality.
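The substitution operation F[a/x] can be sketched on a simple term representation (the tuple encoding and the capitalized-variable convention are our assumptions, not part of the slides):

```python
# Terms are strings (variables/constants) or tuples (functor, arg1, ..., argN).
# Assumed convention: variables are capitalized strings, constants lowercase.

def is_variable(t):
    return isinstance(t, str) and t[:1].isupper()

def substitute(term, theta):
    """Apply substitution theta (a dict variable -> term) to a term: F[a/x]."""
    if is_variable(term):
        return theta.get(term, term)      # replace bound variables
    if isinstance(term, tuple):
        # Recurse into the arguments, keeping the functor unchanged.
        return tuple([term[0]] + [substitute(arg, theta) for arg in term[1:]])
    return term                           # constants are left as-is

# Likes(X, sun)[marcus/X]  ->  Likes(marcus, sun)
print(substitute(("likes", "X", "sun"), {"X": "marcus"}))
```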
27. Eg:- Brother(John) = Smith
Here, the object referred to by Brother(John) is the same as the object referred to by Smith. The equality
symbol can also be used with negation to state that two terms are not the same object.
Eg:- ¬(x = y), which is equivalent to x ≠ y
Inference rules for the quantifiers:
As in PL, we also have inference rules in First Order Logic:
• Universal Generalization
• Universal Instantiation
• Existential Instantiation
• Existential Introduction
28. • Universal Generalization:
• It is a valid inference rule which states that if the premise p(c) is true for any
arbitrary element c in the universe of discourse, then we can draw the conclusion
∀x p(x).
• It can be represented as p(c) / ∀x p(x)
Eg:- Let p(c) represent "A byte contains 8 bits"; then ∀x p(x), "all bytes contain 8
bits", will also be true.
Universal Instantiation :-
It is also called universal elimination. It can be applied multiple times to add new
sentences.
The UI rule states that we can infer any statement p(c), obtained by
substituting a ground term c (a constant within the domain of x)
into ∀x p(x), for any object in the universe of discourse.
∀x p(x) / p(c)
Eg:- If "every person likes ice cream", ∀x p(x), then we can infer that "John
likes ice cream", p(c).
29. • Existential Introduction:-
• It is also called Existential Generalization, and it is a valid inference
rule in First Order Logic.
• This rule states that if some element c in the universe of
discourse has a property p, then we can infer that there exists
something in the universe which has property p.
• p(c) / ∃x p(x)
Eg:- "Pinky got good marks in maths"; therefore, "someone got good
marks in maths".
30. • Unification:
• Lifted inference rules require finding substitutions that make different logical expressions look identical.
This process is called unification and is a key component of all first-order inference algorithms. The
UNIFY algorithm takes two sentences and returns a unifier for them if one exists:
• UNIFY(p, q)=θ where SUBST(θ, p)= SUBST(θ, q) .
• The forward-chaining algorithm starts with the facts in the knowledge base, applies
every rule whose premises are satisfied, and adds the resulting conclusions, repeating
until the query is answered or no new facts can be inferred.
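The substitution-finding process described above can be sketched as a recursive UNIFY; the tuple term encoding, the capitalized-variable convention, and the omission of the occurs check are our simplifying assumptions:

```python
def is_variable(t):
    # Assumed convention: variables are capitalized strings, constants lowercase.
    return isinstance(t, str) and t[:1].isupper()

def unify(x, y, theta=None):
    """Return a substitution (dict) making x and y identical, or None on failure.
    Compound terms are tuples: (functor, arg1, ..., argN). No occurs check."""
    if theta is None:
        theta = {}
    if x == y:
        return theta
    if is_variable(x):
        return unify_var(x, y, theta)
    if is_variable(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):          # unify argument lists element-wise
            theta = unify(xi, yi, theta)
            if theta is None:
                return None
        return theta
    return None

def unify_var(var, x, theta):
    if var in theta:
        return unify(theta[var], x, theta)
    if is_variable(x) and x in theta:
        return unify(var, theta[x], theta)
    new_theta = dict(theta)               # extend the substitution
    new_theta[var] = x
    return new_theta

# UNIFY(Knows(john, X), Knows(Y, bill)) = {Y/john, X/bill}
print(unify(("knows", "john", "X"), ("knows", "Y", "bill")))
```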
32. • Efficient forward chaining:
• The simple forward-chaining algorithm is designed for ease of understanding rather
than for efficiency of operation. There are three possible sources of inefficiency.
• First, the “inner loop” of the algorithm involves finding all possible unifiers such that
the premise of a rule unifies with a suitable set of facts in the knowledge base.
This is often called pattern matching and can be very expensive.
• Second, the algorithm rechecks every rule on every iteration to see whether its
premises are satisfied, even if very few additions are made to the knowledge base
on each iteration.
• Finally, the algorithm might generate many facts that are irrelevant to the goal. We
address each of these issues in turn.
33. • BACKWARD CHAINING:
• These algorithms work backward from the goal, chaining through rules to find known facts that support the
proof.
• Backward chaining, as we have written it, is clearly a depth-first search algorithm. This means that its space
requirements are linear in the size of the proof (neglecting, for now, the space required to accumulate the
solutions).
• FOL-BC-ASK(KB, goal ) will be proved if the knowledge base contains a clause of the form lhs ⇒ goal,
where lhs (left-hand side) is a list of conjuncts.
• Backward chaining is a kind of AND/OR search—the OR part because the goal query can be proved by any
rule in the knowledge base, and the AND part because all the conjuncts in the lhs of a clause must be
proved.
• FOL-BC-OR works by fetching all clauses that might unify with the goal, standardizing the variables in the
clause to be brand-new variables, and then, if the rhs of the clause does indeed unify with the goal, proving
every conjunct in the lhs, using FOL-BC-AND.
34. • It also means that backward chaining (unlike forward chaining) suffers from problems with repeated
states and incompleteness.
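The AND/OR structure of backward chaining can be sketched in a ground setting (no unification or variable standardizing, which the full FOL-BC-ASK needs); the rule base mirrors the same hypothetical example:

```python
# Each goal maps to a list of rule bodies (conjunct lists) that conclude it.
rules = {
    "criminal(west)": [["american(west)", "weapon(m1)",
                        "sells(west,m1,nono)", "hostile(nono)"]],
    "weapon(m1)": [["missile(m1)"]],
    "hostile(nono)": [["enemy(nono,america)"]],
    "sells(west,m1,nono)": [["missile(m1)", "owns(nono,m1)"]],
}
facts = {"american(west)", "missile(m1)", "owns(nono,m1)", "enemy(nono,america)"}

def prove(goal):
    """OR step: a goal holds if it is a known fact, or some rule body is provable."""
    if goal in facts:
        return True
    for body in rules.get(goal, []):
        # AND step: every conjunct in the body (the lhs) must be proved.
        if all(prove(subgoal) for subgoal in body):
            return True
    return False

print(prove("criminal(west)"))  # True
```

The depth-first character of the search is visible in the recursion: space grows only with the depth of the proof.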
35. •Logic Programming:
• Logic programming is a technology that comes fairly close to
embodying the declarative ideal that systems should be
constructed by expressing knowledge in a formal language
and that problems should be solved by running inference
processes on that knowledge.
• The ideal is summed up in Robert Kowalski’s equation,
• Algorithm = Logic + Control .
• Prolog is the most widely used logic programming language.
36. • Logic Programming:
• It is used primarily as a rapid prototyping language and for symbol-
manipulation tasks such as writing compilers (Van Roy, 1990) and
parsing natural language (Pereira and Warren, 1980).
• Many expert systems have been written in Prolog for legal, medical,
financial, and other domains.
• Prolog programs are sets of definite clauses written in a notation
somewhat different from standard first-order logic.
• Prolog uses uppercase letters for variables and lowercase for
constants—the opposite of our convention for logic.
37. • Commas separate conjuncts in a clause, and the clause is written “backwards” from
what we are used to; instead of A ∧ B ⇒ C in Prolog we have
• C :- A, B. Here is a typical example:
• criminal(X) :- american(X), weapon(Y), sells(X,Y,Z),
hostile(Z).
• The notation [E|L] denotes a list whose first element is E and whose rest is L.
Here is a Prolog program for append(X,Y,Z), which succeeds if list Z is the result
of appending lists X and Y:
append([],Y,Y).
append([A|X],Y,[A|Z]) :- append(X,Y,Z).
• The execution of Prolog programs is done through depth-first backward chaining.
38. • Efficient implementation of logic programs:
• The execution of a Prolog program can happen in two modes:
interpreted and compiled.
• Interpretation essentially amounts to running the FOL-BC-
ASK algorithm with the program as the knowledge base.
• We say “essentially” because Prolog interpreters contain a
variety of improvements designed to maximize speed. Here
we consider only two.
39. • Efficient implementation of logic programs:
• First, our implementation had to explicitly manage the iteration
over possible results generated by each of the subfunctions.
• Prolog interpreters have a global data structure, a stack of
choice points to keep track of the multiple possibilities that we
considered in FOL-BC-OR.
• This global stack is more efficient, and it makes debugging
easier, because the debugger can move up and down the
stack.
40. • Second, our simple implementation of FOL-BC-ASK spends a good deal of time generating
substitutions.
• Instead of explicitly constructing substitutions, Prolog has logic variables that remember their
current binding.
• At any point in time, every variable in the program either is unbound or is bound to some
value.
• Together, these variables and values implicitly define the substitution for the current branch of
the proof.
• Extending the path can only add new variable bindings, because an attempt to add a different
binding for an already bound variable results in a failure of unification.
• When a path in the search fails, Prolog will back up to a previous choice point, and then it
might have to unbind some variables.
41. • There are two choices for representing categories in first-order logic: predicates and
objects.
• What do we need to express? – Categories, Measures, Composite objects, Time, Space,
Change, Events, Processes, Physical Objects, Substances, Mental Objects, Beliefs.
• KR requires the organization of objects into categories, although – interaction happens at
the level of the object – reasoning happens at the level of categories.
• Categories play a role in predictions about objects – based on perceived properties.
• Categories can be represented in two ways in FOL: 1. Predicates: Apple(x) 2. Reification of
categories into objects: Apples
• Category = set of its members – Example: Member(x, Apples), x ∈ Apples;
Subset(Apples, Fruits), Apples ⊂ Fruits
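The set-based reading of categories maps directly onto set operations; a small sketch with hypothetical individuals:

```python
# Member(x, Apples) becomes x in apples; Subset(Apples, Fruits) becomes apples <= fruits.
apples = {"apple1", "apple2"}
fruits = apples | {"banana1"}

print("apple1" in apples)   # Member(apple1, Apples)
print(apples <= fruits)     # Subset(Apples, Fruits)
```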
42. • Categories serve to organize and simplify the knowledge base
through inheritance.
• Relation = inheritance:
• – All instances of food are edible; fruit is a subclass of food, and apples
are a subclass of fruit, so an apple is edible.
• – Individual apples inherit the property of edibility from food
• Defines a taxonomy
• – Subclass relations organize categories
43. • Two or more categories are disjoint if they are mutually exclusive –
Disjoint({Animals, Vegetables})
• A decomposition of a class into categories is called exhaustive if each
object of the class must belong to at least one category
• – living = {animal, vegetable, fungi, bacteria}
• A partition is an exhaustive decomposition of a class into disjoint
subsets.
• – student = {undergraduate, graduate}
44. • Physical composition
• One object may be part of another:
• – PartOf(Romania,EasternEurope)
• – PartOf(EasternEurope,Europe)
• The PartOf predicate is transitive (and reflexive), so we can infer that
PartOf(Romania,Europe)
• More generally:
• – ∀ x PartOf(x,x) – ∀ x,y,z PartOf(x,y) ∧ PartOf(y,z) ⇒ PartOf(x,z)
• Logical Minimization: Defining an object as the smallest one satisfying
certain conditions.
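The transitivity and reflexivity axioms above can be applied mechanically by computing the closure of the PartOf relation; a brute-force sketch:

```python
# Transitive and reflexive closure of PartOf by iteration to a fixpoint.
part_of = {("Romania", "EasternEurope"), ("EasternEurope", "Europe")}
objects = {o for pair in part_of for o in pair}

closure = set(part_of) | {(x, x) for x in objects}   # ∀x PartOf(x, x)
changed = True
while changed:
    changed = False
    for (a, b) in list(closure):
        for (c, d) in list(closure):
            # ∀x,y,z PartOf(x, y) ∧ PartOf(y, z) ⇒ PartOf(x, z)
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True

print(("Romania", "Europe") in closure)  # True, by transitivity
```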
45. • Measurements
• Objects have height, mass, cost,....
• – Values that we assign to these properties are called measures
• Combine Unit functions with a number to measure line segment object L1
• – Length(L1) = Inches(1.5) = Centimeters(3.81).
• Conversion between units:
• – ∀ i Centimeters(2.54 x i)=Inches(i).
• Some measures have no scale:
• – Beauty, Difficulty, etc.
• – The most important aspect of measures is not the numerical values, but the fact
that measures can be ordered.
• – An apple can have deliciousness 0.9 or 0.1.
46. • Objects: Things and stuff
• Stuff: defies any obvious individuation, i.e., division into distinct objects
(mass nouns)
• Things: allow individuation (count nouns)
• Example: Butter (stuff) and Cow (things):
b ∈ Butter ∧ PartOf(p, b) ⇒ p ∈ Butter
50. • Simulated Annealing
• A hill-climbing algorithm that never makes “downhill” moves toward states
with lower value (or higher cost) is guaranteed to be incomplete, because it
can get stuck on a local maximum.
• In contrast, a purely random walk—that is, moving to a successor chosen
uniformly at random from the set of successors—is complete but extremely
inefficient.
• Therefore, it seems reasonable to try to combine hill climbing with a random
walk in some way that yields both efficiency and completeness.
• In metallurgy, annealing is the process used to temper or harden metals
and glass by heating them to a high temperature and then gradually cooling
them, thus allowing the material to reach a low-energy crystalline state.
• Simulated annealing is an algorithm that combines hill climbing with a
random walk in some way that yields both efficiency and completeness.
51. • Simulated Annealing
• Instead of picking the best move, however, simulated annealing picks a
random move.
• If the move improves the situation, it is always accepted.
• Otherwise, the algorithm accepts the move with some probability
less than 1.
• The probability decreases exponentially with the "badness" of the
move – the amount ΔE by which the evaluation is worsened.
52. • The differences between simulated annealing and
the simple hill-climbing procedure are given
below:
• An annealing schedule must be maintained.
• Moves to worse states may be accepted.
• It is a good idea to maintain, in addition to the current state, the best
state found so far; if the final state is worse than an
earlier state, the earlier state is still available.
53. • Two Changes:
• The term objective function is used in place of the term
heuristic function.
• We minimize rather than maximize the value of the objective
function, so we describe a process of valley descending
rather than hill climbing.
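The acceptance rule described above can be sketched in its maximizing form; the toy objective, neighbor function, and linear cooling schedule are illustrative assumptions:

```python
import math
import random

def simulated_annealing(value, neighbor, start, schedule):
    """Minimal sketch of simulated annealing (maximizing `value`).
    `schedule(t)` gives the temperature at step t; the loop stops when it hits 0."""
    current = start
    t = 0
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = neighbor(current)
        delta_e = value(nxt) - value(current)
        # Always accept improvements; accept worse moves with probability e^(ΔE/T),
        # which shrinks exponentially with the badness ΔE and with falling T.
        if delta_e > 0 or random.random() < math.exp(delta_e / T):
            current = nxt
        t += 1

# Toy objective with its single maximum at x = 0 (assumed for illustration).
random.seed(0)
f = lambda x: -x * x
step = lambda x: x + random.uniform(-1, 1)
best = simulated_annealing(f, step, start=10.0,
                           schedule=lambda t: max(0.0, 1.0 - t / 2000))

print(best)  # typically ends near the maximum at 0
```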
55. LOCAL SEARCH IN CONTINUOUS SPACES
We have considered algorithms that work only in discrete environments, but real-world
environments are continuous.
Local search amounts to maximizing a continuous objective function in a multi-
dimensional vector space.
This is hard to do in general.
One immediate retreat is to discretize:
o Discretize the space near each state
o Apply a discrete local search strategy (e.g., stochastic hill climbing, simulated
annealing)
The objective function often resists a closed-form solution:
o Fake up an empirical gradient
o This amounts to greedy hill climbing in a discretized state space
We can employ the Newton-Raphson method to find maxima.
Continuous spaces suffer from the same problems as discrete ones: plateaus, ridges, local
maxima, etc.
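"Faking up an empirical gradient" can be sketched as finite-difference steepest ascent; the objective, step size, and step count below are illustrative assumptions:

```python
def empirical_gradient(f, x, eps=1e-5):
    """Estimate the gradient of f at x by central finite differences."""
    grad = []
    for i in range(len(x)):
        xp = list(x); xp[i] += eps
        xm = list(x); xm[i] -= eps
        grad.append((f(xp) - f(xm)) / (2 * eps))
    return grad

def hill_climb(f, x, alpha=0.1, steps=200):
    """Take small uphill steps along the empirical gradient."""
    for _ in range(steps):
        g = empirical_gradient(f, x)
        x = [xi + alpha * gi for xi, gi in zip(x, g)]
    return x

# Maximize f(x, y) = -(x - 1)^2 - (y + 2)^2, whose peak is at (1, -2).
f = lambda v: -(v[0] - 1) ** 2 - (v[1] + 2) ** 2
x_opt = hill_climb(f, [0.0, 0.0])
print([round(v, 3) for v in x_opt])  # [1.0, -2.0]
```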
56. • Online Search Agents and Unknown Environments
• Online search problems
Offline search (all algorithms so far):
Compute a complete solution, ignoring the environment, then carry out the action sequence
Online search:
Interleave computation and action
Compute → Act → Observe → Compute → ...
Online search is good:
For dynamic, semi-dynamic, and stochastic domains
Whenever offline search would yield exponentially many contingencies
Online search is necessary for exploration problems:
States and actions are unknown to the agent
The agent uses actions as experiments to determine what to do
57. Examples
Robot exploring an unknown building
Classical hero escaping a labyrinth
Assume the agent knows:
The actions available in state s
The step-cost function c(s, a, s′)
Whether state s is a goal state
When it has visited a state s previously
An admissible heuristic function h(s)
Note that the agent doesn't know the outcome state s′ for a given action a until it tries
the action (and likewise for all the other actions from a state s)
The competitive ratio compares the actual cost with the cost the agent would incur if it
knew the search space
No agent can avoid dead ends in all state spaces
Robotics examples: staircase, ramp, cliff, terrain
Assume the state space is safely explorable: some goal state is always reachable
58. Online Search Agents
Interleaving planning and acting hamstrings offline search:
A* expands arbitrary nodes without waiting for the outcome of an action; an online
algorithm can expand only the node it physically occupies, so it is best to explore
nodes in physically local order
This suggests using depth-first search:
The next node is always a child of the current node
When all actions in a state have been tried, the agent can't just drop the state; it must
physically backtrack
Online depth-first search:
May have an arbitrarily bad competitive ratio (wandering past the goal); okay for
exploration, bad for minimizing path cost
Online iterative-deepening search:
The competitive ratio stays small for a state space that is a uniform tree
59. Online Local Search
Hill climbing search:
Also has physical locality in its node expansions; it is, in fact, already an online
search algorithm
Local maxima are problematic: we can't randomly transport the agent to a new state in an
effort to escape a local maximum
Random walk as an alternative:
Select an action at random from the current state
Will eventually find a goal node in a finite space
Can be very slow, especially if "backward" steps are as common as "forward" ones
Hill climbing with memory instead of randomness:
Store a "current best estimate" of the cost to the goal at each visited state; the
starting estimate is just h(s)
Augment the estimate based on experience in the state space; this tends to "flatten
out" local minima, allowing progress
Employ optimism under uncertainty:
Untried actions are assumed to have the least possible cost, encouraging exploration
of untried paths
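The memory-based scheme above can be sketched in the spirit of LRTA* on a toy corridor world; the world, the step cost of 1, and the 0-initialized (optimistic) estimates are our assumptions:

```python
# States 0..4 in a corridor, goal at 4. Stored cost-to-goal estimates H start
# at the optimistic value 0 and are updated from experience as the agent moves.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
goal = 4
H = {s: 0 for s in neighbors}   # optimism under uncertainty

s, steps = 0, 0
while s != goal and steps < 100:
    # Choose the action minimizing estimated total cost: step cost 1 + H(next).
    best = min(neighbors[s], key=lambda n: 1 + H[n])
    H[s] = 1 + H[best]          # update the estimate for the state being left
    s = best
    steps += 1

print(s == goal, steps)  # reaches the goal in 4 steps on this corridor
```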
60. Learning in Online Search
o Rampant ignorance is a ripe opportunity for learning: the agent
learns a "map" of the environment
o The outcome of each action in each state
o Local search agents improve evaluation-function accuracy
o They update the estimate of the value at each visited state
o We would like to infer a higher-level domain model
o Example: "Up" in maze search increases the y-coordinate
This requires:
o A formal way to represent and manipulate such general rules
(so far, rules have been hidden within the successor function)
o Algorithms that can construct general rules based on
observations of the effects of actions