ESTD: 2003
Dr. Vishwanath V Murthy, MCA, RNSIT
Subject: Artificial Intelligence
Subject Code: 22MCA262
Module-4: Planning and Machine Learning
4.1) Planning and machine learning
4.2) Basic plan generation system – STRIPS
4.3) Advanced plan generation system – K STRIPS
4.4) Strategic explanation
- Why, why not, and how explanations
4.5) Learning
4.6) Machine learning
4.7) Adaptive learning
4.1) Planning and machine learning
Planning → a set of actions
Action → preconditions (which need to be satisfied)
• Components of a Planning System:
In any general problem-solving system, the elementary techniques to perform are:
a) Choose the best rule (based on heuristics) to be applied.
b) Apply the chosen rule to get a new problem state.
c) Detect when a solution has been found.
d) Detect dead ends so that new directions are explored.
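The four elementary techniques above can be sketched as a single best-first search loop. All names below (`solve`, the toy "+1"/"+2" rules, the scoring function) are illustrative, not from the slides.

```python
def solve(start, is_goal, applicable_rules, apply_rule, score):
    """Generic best-first problem solving: (a) choose the most promising
    state by a heuristic score, (b) apply rules to get new states,
    (c) detect a solution, (d) detect dead ends and explore elsewhere."""
    frontier = [start]
    visited = set()
    while frontier:
        frontier.sort(key=score)        # (a) most promising state first
        state = frontier.pop(0)
        if state in visited:
            continue
        if is_goal(state):              # (c) solution detected
            return state
        visited.add(state)
        new_states = [apply_rule(state, r) for r in applicable_rules(state)]  # (b)
        if not new_states:              # (d) dead end: explore another direction
            continue
        frontier.extend(s for s in new_states if s not in visited)
    return None

# Toy use: reach 3 from 0 with rules "+1" and "+2",
# scoring states by how far they are from the goal.
print(solve(0, lambda s: s == 3, lambda s: [1, 2],
            lambda s, r: s + r, score=lambda s: abs(3 - s)))  # → 3
```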
Planning for the Block World Problem
• The block-world planning problem:
• All blocks are square and of the same size.
• Blocks can be stacked one upon another.
• There is a flat surface (table) on which blocks can be placed.
• Only one block can be moved at a time to achieve the target.
Example: Block World Problem
• The five predicates used and their intended meanings:
1. ON(A, B): Block A is on block B.
2. ON(A, table): A is on the table.
3. CLEAR(A): Nothing is on top of A.
4. HOLDING(A): The arm is holding A.
5. ARMEMPTY: The arm is holding nothing.
• The robot performs four types of operations:
Rule/operation | Precondition | Add | Delete
PICKUP(X) | On(X, table) ∧ clear(X) ∧ ARMEMPTY | holding(X) | On(X, table), ARMEMPTY
PUTDOWN(X) | holding(X) | On(X, table), ARMEMPTY | holding(X)
STACK(X, Y) | holding(X) ∧ clear(Y) | On(X, Y), ARMEMPTY | holding(X), clear(Y)
UNSTACK(X, Y) | On(X, Y) ∧ clear(X) ∧ ARMEMPTY | holding(X), clear(Y) | On(X, Y), ARMEMPTY
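The operator table can be run directly by encoding each operator as (precondition, add, delete) sets of ground predicates. This is an illustrative sketch for two blocks A and B; `OPS` and `apply_op` are names invented here.

```python
# Two block-world operators as (precondition, add-list, delete-list),
# grounded for blocks A and B (a sketch, not a full planner).
OPS = {
    "PICKUP(A)": ({"On(A, table)", "clear(A)", "ARMEMPTY"},
                  {"holding(A)"},
                  {"On(A, table)", "ARMEMPTY"}),
    "STACK(A,B)": ({"holding(A)", "clear(B)"},
                   {"On(A,B)", "ARMEMPTY"},
                   {"holding(A)", "clear(B)"}),
}

def apply_op(state, op):
    """Apply one operator: check preconditions, then delete and add."""
    pre, add, delete = OPS[op]
    if not pre <= state:                 # every precondition must hold
        raise ValueError(f"{op} is not applicable in this state")
    return (state - delete) | add        # delete-list first, then add-list

state = {"On(A, table)", "On(B, table)", "clear(A)", "clear(B)", "ARMEMPTY"}
state = apply_op(state, "PICKUP(A)")
state = apply_op(state, "STACK(A,B)")
print(sorted(state))  # → ['ARMEMPTY', 'On(A,B)', 'On(B, table)', 'clear(A)']
```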
4.2) STRIPS
• One of the most influential approaches to automated planning is STRIPS (Stanford Research Institute Problem Solver).
• STRIPS is a language used for expressing planning problems.
• It is primarily concerned with the automatic generation of plans, which are sequences of actions.
• Knowledge representation: first-order logic.
• Algorithm: forward chaining on rules.
• The STRIPS algorithm operates by maintaining a database of predicates that describe the state of the world.
• Planning with STRIPS:
1. Define the initial state: where the system starts.
2. Set the goal state: what the system should achieve.
3. Develop actions: defined by their preconditions and effects.
4. Search for solutions: using a strategy such as backward chaining from the goal state to the initial state, identifying actions that satisfy the goal conditions.
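The four steps can be sketched as a tiny state-space planner. The slides describe backward chaining from the goal; a forward breadth-first search is shown here instead because it is shorter, and the operator set is a two-block sketch invented for this example.

```python
from collections import deque

def plan(init, goal, ops):
    """Minimal forward search over STRIPS-style operators.
    ops: {name: (preconditions, add-list, delete-list)} as sets."""
    start = frozenset(init)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:                      # all goal conditions hold
            return steps
        for name, (pre, add, dele) in ops.items():
            if pre <= state:                   # operator applicable
                nxt = frozenset((state - dele) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None

ops = {
    "PICKUP(A)": ({"On(A,table)", "clear(A)", "ARMEMPTY"},
                  {"holding(A)"}, {"On(A,table)", "ARMEMPTY"}),
    "STACK(A,B)": ({"holding(A)", "clear(B)"},
                   {"On(A,B)", "ARMEMPTY"}, {"holding(A)", "clear(B)"}),
}
init = {"On(A,table)", "On(B,table)", "clear(A)", "clear(B)", "ARMEMPTY"}
print(plan(init, {"On(A,B)"}, ops))  # → ['PICKUP(A)', 'STACK(A,B)']
```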
• Components of STRIPS:
• States: defined by a set of logical propositions.
• Goals: specified as a set of conditions that describe the desired outcome.
• Actions: each action in STRIPS is characterized by three components:
• Preconditions: conditions that must be true for the action to be executed.
• Add effects: conditions that become true due to executing the action.
• Delete effects: conditions that become false due to executing the action.
• Example: Using STRIPS for the Blocks World
(This worked example appears as a sequence of figures in the original slides and is not reproduced here.)
4.3) K STRIPS
• Modal operator K:
• We are familiar with the use of the connectives ‘∧’ and ‘∨’ in logic.
• These connectives (operators) construct more complex formulas from simpler components.
• In K STRIPS,
− we want to construct a formula whose intended meaning is that a certain agent knows a certain proposition.
− The components consist of
o a term denoting the agent and
o a formula denoting a proposition that the agent knows.
− To accomplish this, the modal operator K is introduced.
• For example,
• To say that Robot (the name of an agent) knows that block A is on block B, we write
K(Robot, On(A,B))
• The sentence formed by combining K with the term Robot and the formula On(A,B) is a new formula, the intended meaning of which is “Robot knows that block A is on block B”.
• The words “knows” and “believes” differ in meaning:
• an agent can believe a false proposition, but it cannot know anything that is false.
• Some examples:
1. K(Agent1, K(Agent2, On(A,B)))
• means Agent1 knows that Agent2 knows that A is on B.
2. K(Agent1, On(A,B)) ∨ K(Agent1, On(A,C))
• means that either Agent1 knows that A is on B or it knows that A is on C.
3. K(Agent1, On(A,B)) ∨ K(Agent1, ¬On(A,B))
• means that Agent1 knows whether or not A is on B.
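Nested K-formulas like the examples above can be represented as ordinary nested data. The tuple encoding and the `show` printer below are an illustrative sketch invented here, not part of K STRIPS itself.

```python
# K-formulas as nested tuples: ('K', agent, formula) wraps a formula,
# ('or', f, g) and ('not', f) are connectives, strings are atoms.
def show(f):
    """Render a K-formula tuple back into the slide notation."""
    if isinstance(f, str):
        return f
    op = f[0]
    if op == 'K':
        return f"K({f[1]}, {show(f[2])})"
    if op == 'or':
        return f"{show(f[1])} ∨ {show(f[2])}"
    if op == 'not':
        return f"¬{show(f[1])}"
    raise ValueError(f"unknown operator: {op}")

# Example 3 above: Agent1 knows whether or not A is on B.
f = ('or', ('K', 'Agent1', 'On(A,B)'),
           ('K', 'Agent1', ('not', 'On(A,B)')))
print(show(f))  # → K(Agent1, On(A,B)) ∨ K(Agent1, ¬On(A,B))
```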
• Example: planning a speech action
• We can treat speech acts just like other agent actions.
• Our agent can use a plan-generating system to make plans comprising speech acts and other actions.
• To do so, it needs a model of the effects of these actions.
• Consider, for example,
− Tell(A, φ), where A is an agent and φ is true.
• We could model the effects of that action by the STRIPS rule:
Tell(A, φ):
− Precondition: Next_to(A) ∧ φ ∧ ¬K(A, φ)
− Delete: ¬K(A, φ)
− Add: K(A, φ)
• Recall the STRIPS rule for Tell(A, φ):
Tell(A, φ):
− Precondition: Next_to(A) ∧ φ ∧ ¬K(A, φ)
− Delete: ¬K(A, φ)
− Add: K(A, φ)
• The precondition Next_to(A) ensures that our agent is close enough to agent A to enable communication.
• The precondition φ is imposed to ensure that our agent actually believes φ before it can inform another agent of its truth.
• The precondition ¬K(A, φ) ensures that our agent does not communicate redundant information.
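The Tell rule can be exercised on a small knowledge state. In this sketch, K(A, φ) is encoded simply as the string "K(A,φ)" inside a set of facts; the `tell` function and the `speaker_near` flag are assumptions made for illustration.

```python
# A toy model of Tell(A, phi) as a STRIPS-style rule over a fact set.
# K(agent, phi) is encoded as the string f"K({agent},{phi})".
def tell(state, speaker_near, agent, phi):
    """Check the three preconditions, then add K(agent, phi)."""
    k_fact = f"K({agent},{phi})"
    pre_ok = (speaker_near            # Next_to(A): close enough to talk
              and phi in state        # phi: the speaker believes it
              and k_fact not in state)  # ¬K(A, phi): not redundant
    if not pre_ok:
        raise ValueError("Tell precondition failed")
    return state | {k_fact}           # ¬K(A, phi) deleted, K(A, phi) added

state = {"On(A,B)"}
state = tell(state, True, "A", "On(A,B)")
print(sorted(state))  # → ['K(A,On(A,B))', 'On(A,B)']
```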
4.4) Strategic Explanation
• What is strategic explanation?
• The practice of explaining AI decisions in a way that aligns with human understanding and goals.
• It focuses on three core types of explanations:
− Why, Why Not, and How.
• Why is it important?
• Transparency: enables users to trust AI decisions.
• Regulatory compliance: many sectors require explanations for decisions made by AI systems.
• User engagement: improves usability and satisfaction.
1. Why Explanation
• Explains why a specific decision or recommendation was made by the AI system.
• Provides the reasoning behind the AI’s outcome.
• Helps users understand the system’s logic.
• Builds trust and confidence in the AI system.
• Example:
• Medical diagnosis system:
− Explains why the AI system recommended a certain treatment.
− The reason given may be: because the patient’s symptoms matched a specific pattern seen in historical data.
1. Why Explanation
• Techniques:
• Feature importance: highlights the key factors influencing the decision (e.g., age, income, symptoms).
• Rule-based explanations: simple “if-then” rules that describe why the decision was made.
• Challenges:
• Complex AI models (e.g., deep learning) make “why” explanations harder to extract.
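A rule-based "why" explanation can be sketched in a few lines: the system reports which if-then rule fired and why. The rule set, feature names, and diagnosis below are hypothetical, invented only to mirror the medical example above.

```python
# A minimal rule-based "why" explanation (hypothetical rules/features).
RULES = [
    ("recommend_treatment_X",
     {"fever": True, "cough": True},
     "the symptoms matched the pattern {fever, cough} seen in historical cases"),
]

def decide_and_explain(patient):
    """Return (decision, why-explanation) for the first matching rule."""
    for decision, conditions, reason in RULES:
        if all(patient.get(k) == v for k, v in conditions.items()):
            return decision, f"Why: {reason}"
    return "no_action", "Why: no rule matched the reported symptoms"

decision, why = decide_and_explain({"fever": True, "cough": True})
print(decision)  # → recommend_treatment_X
print(why)
```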
2. Why Not Explanation
• Explains why an expected or alternative outcome was not selected.
• Clarifies why a particular decision was rejected.
• Manages user expectations and reduces dissatisfaction.
• Example:
• Loan application system:
− The AI system might explain why a loan was denied (e.g., insufficient credit score).
2. Why Not Explanation
• Techniques:
• Counterfactual explanations:
− Describe what would need to change for the alternative outcome.
− Ex: “If your credit score had been 50 points higher, your loan would have been approved.”
• Contrastive explanations:
− Compare the actual decision that was made with the alternative decision.
• Challenges:
• Developing accurate counterfactuals or alternatives in high-dimensional data spaces can be difficult and computationally expensive.
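For a single-threshold decision, the counterfactual in the loan example above is trivial to compute: report the smallest change that would flip the outcome. The threshold value of 700 is an assumption for illustration.

```python
# A toy counterfactual ("why not") for a threshold-based loan decision.
THRESHOLD = 700   # hypothetical approval cutoff

def why_not(credit_score):
    """Approve, or explain the smallest change that would flip the outcome."""
    if credit_score >= THRESHOLD:
        return "approved"
    gap = THRESHOLD - credit_score
    return (f"denied: if your credit score had been {gap} points higher, "
            f"your loan would have been approved")

print(why_not(650))  # → denied: if your credit score had been 50 points higher, ...
print(why_not(720))  # → approved
```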
3. How Explanation
• Explains how the AI system arrived at the decision, revealing its internal processes.
• Makes the decision-making process transparent, particularly for technical users (developers and data scientists).
• Example:
• Deep neural network system:
− Shows how different layers transformed inputs into outputs.
3. How Explanation
• Techniques:
• Model visualization: graphical representations of how the input is processed (e.g., layers in a neural network).
• Attention maps: highlight the parts of the input that the model focused on during decision-making.
• Challenges:
• For non-technical users, these explanations can be too detailed and hard to understand, requiring simplification or abstraction.
4.5) Learning
• What is learning?
• Definition: changes in the system that are adaptive in the sense that they enable the system to do the same task more efficiently and more effectively the next time.
• Criticism of AI: an AI system cannot be called intelligent until it can learn to do new things and adapt to new situations, rather than simply doing as it is told.
• Learning covers a wide range of phenomena:
1) Skill refinement: practice improves skills. The more you play tennis, the better you get.
2) Knowledge acquisition: knowledge is generally acquired through experience.
• Various learning mechanisms:
1) Rote learning
2) Learning by taking advice
3) Learning from problem solving
4) Learning from examples
4.5.1) Rote Learning
• When a computer stores a piece of data, it is performing a basic form of learning.
• In data caching, we store computed values so that we need not re-compute them later. When computation is more expensive than recall, this strategy can save a significant amount of time.
• Caching has been used in AI programs to produce some surprising performance improvements. Such caching is known as rote learning.
• Rote learning shows the need for capabilities such as:
• organized storage of information
• generalization
[Fig: storing backed-up values]
• In figure (a), the value of node A is computed as 10; this value is stored for some future use.
• In figure (b), the value of A is required again to solve a new game tree. Instead of re-computing the value of node A, rote learning is used and the stored value of node A is applied directly.
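The figure's idea (compute node A's value of 10 once, then recall it) is ordinary caching. The sketch below uses a dictionary as the cache; `expensive_static_eval` is a hypothetical stand-in for a game-tree evaluation.

```python
# Rote learning as caching: store a computed value the first time,
# recall it on later requests instead of recomputing.
calls = {"n": 0}   # counts how often the expensive computation runs
cache = {}

def expensive_static_eval(node):
    """Stand-in for an expensive game-tree evaluation."""
    return 10      # matches the stored value of node A in the figure

def evaluate(node):
    if node in cache:            # rote learning: recall the stored value
        return cache[node]
    calls["n"] += 1              # expensive computation happens only once
    cache[node] = expensive_static_eval(node)
    return cache[node]

print(evaluate("A"), evaluate("A"), calls["n"])  # → 10 10 1
```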
4.5.2) Learning by Taking Advice
• When a programmer writes a series of instructions into a computer, a basic type of learning is taking place: the programmer is a sort of teacher, and the computer is a sort of student.
• After being programmed, the computer is able to do something it previously could not.
• If the program is written in a high-level language such as Prolog, some interpreter or compiler must intervene to change the teacher’s instructions into code that the machine can execute directly.
• FOO is a program that accepts advice for playing “Hearts”, a card game. A human user first translates the advice from English into a representation that FOO can understand.
• For example:
→ "Avoid taking points" becomes:
(avoid (take-points me) (trick))
→ By UNFOLDing the definition of avoid, FOO comes up with:
(achieve (not (during (trick) (take-points me))))
→ FOO considers the advice to apply to the player called "me". Next, FOO UNFOLDs the definition of trick:
(achieve (not (during
                (scenario
                  (each p1 (players) (play-card p1))
                  (take-trick (trick-winner)))
                (take-points me))))
4.5.3) Learning in Problem Solving
• Can a program get better without the aid of a teacher? It can, by generalizing from its own experiences.
• Various techniques are as follows:
a) Learning by parameter adjustment
b) Learning with macro-operators
c) Learning by chunking
a) Learning by Parameter Adjustment
• Many programs rely on an evaluation procedure that combines information from several sources into a single summary statistic.
1) Game-playing programs do this in their static evaluation functions, in which a variety of factors such as piece advantage and mobility are combined into a single score reflecting the desirability (goodness or usefulness) of a particular board position.
2) Pattern-classification programs often combine several features to determine the correct category into which a given stimulus should be placed.
• In designing such programs, it is often difficult to know a priori how much weight should be attached to each feature being used.
• One way of finding the correct weights is to begin with some estimate of the correct settings and then let the program modify the settings based on its experience.
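The "start with an estimate, then adjust from experience" idea above can be sketched as a simple error-driven weight update (a perceptron-style rule; the learning rate and toy features are assumptions, not from the slides).

```python
# Learning by parameter adjustment: nudge feature weights after each
# outcome so the evaluation function's score tracks observed results.
def evaluate(weights, features):
    """Weighted-sum evaluation function (piece advantage, mobility, ...)."""
    return sum(w * f for w, f in zip(weights, features))

def adjust(weights, features, target, rate=0.1):
    """Move each weight in proportion to the prediction error."""
    error = target - evaluate(weights, features)
    return [w + rate * error * f for w, f in zip(weights, features)]

w = [0.0, 0.0]                 # initial estimate of the weights
for _ in range(50):            # experience: a position with features [1, 2]
    w = adjust(w, [1.0, 2.0], target=1.0)
print(round(evaluate(w, [1.0, 2.0]), 2))  # → 1.0
```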
b) Learning with Macro-Operators
• Sequences of actions that can be treated as a whole are called macro-operators.
• Example:
• Suppose we want to go to the main post office of the city. Our solution may involve getting into the car, starting it, and driving along a certain route. Substantial planning may go into choosing the appropriate route.
• Here, we need not plan “how to start the car”; we can use START-CAR as an atomic action.
• It consists of several primitive actions:
1) sitting down,
2) adjusting the mirror,
3) inserting the key, and
4) turning the key.
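A macro-operator can be sketched as a stored sequence of primitive actions replayed as one step. The START-CAR name comes from the example above; the `MACROPS` table and `expand` function are illustrative.

```python
# A macro-operator: one name standing for a stored action sequence.
MACROPS = {
    "START-CAR": ["sit down", "adjust mirror", "insert key", "turn key"],
}

def expand(plan):
    """Replace each macro-operator in a plan with its primitive actions."""
    out = []
    for step in plan:
        out.extend(MACROPS.get(step, [step]))  # replay macro, keep primitives
    return out

print(expand(["START-CAR", "drive to post office"]))
# → ['sit down', 'adjust mirror', 'insert key', 'turn key', 'drive to post office']
```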
b) Learning with Macro-Operators
• Macro-operators were used in the early problem-solving system STRIPS. After each problem-solving episode, the learning component takes the computed plan and stores it away as a macro-operator, or MACROP.
• A MACROP is just like a regular operator, except that it consists of a sequence of actions, not just a single one.
c) Learning by Chunking
• Chunking is a process similar in flavor to macro-operators. The idea of chunking comes from the psychological literature on memory and problem solving. Its computational basis is in production systems.
• When a system detects a useful sequence of production firings, it creates a chunk, which is essentially a large production that does the work of an entire sequence of smaller ones.
• SOAR is an example of a production system that uses chunking.
• Chunks learned during the initial stages of solving a problem are applicable in the later stages of the same problem-solving episode. After a solution is found, the chunks remain in memory, ready for use in the next problem.
• At present, chunking is inadequate for duplicating the contents of large directly-computed macro-operator tables.
4.5.4) Learning from Examples
• Classification is the process of assigning, to a particular input, the name of a class to which it belongs. The classes from which the classification procedure can choose can be described in a variety of ways.
• Their definition will depend on the use to which they are put. Classification is an important component of many problem-solving tasks.
• Before classification can be done, the classes to be used must be defined:
1) Isolate a set of features that are relevant to the task domain, and define each class by a weighted sum of the values of these features. Example: if the task is weather prediction, the parameters can be measurements such as rainfall and the location of cold fronts.
2) Isolate a set of features that are relevant to the task domain, and define each class as a structure composed of these features. Example: when classifying animals, the features can be such things as color and length of neck.
• The idea of producing a classification program that can evolve its own class definitions is called concept learning or induction.
Winston’s Learning Program
• This program is an early structural concept-learning program. It operates in a simple blocks-world domain. Its goal was to construct representations of the definitions of concepts in the blocks domain.
• For example, it learned the concepts House, Tent, and Arch.
• A near miss is an object that is not an instance of the concept in question but that is very similar to such instances.
4.6) Machine Learning
• Definition:
• Machine learning is a subset of artificial intelligence.
• It allows systems to learn from data, improve over time, and make decisions or predictions without being explicitly programmed.
• Systems learn from past experience.
• Types of learning in ML:
a) Supervised learning: learning from labeled data.
b) Unsupervised learning: discovering hidden patterns in unlabeled data.
c) Reinforcement learning: learning through trial and error based on rewards.
a) Supervised Learning: learning from labelled data
• Supervised learning is the most common type of machine learning.
• It involves training an algorithm to make predictions based on labelled data.
• Labelled data is data that has already been categorized or classified by humans.
• The goal of supervised learning is to create a model that can accurately predict the label of new, unseen data.
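Supervised learning in miniature: a 1-nearest-neighbour classifier predicts the label of unseen data from labelled examples. The training points and labels below are toy data invented for illustration.

```python
# 1-nearest-neighbour: predict the label of the closest labelled example.
def predict(train, x):
    """train: list of (feature_vector, label); return nearest label."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(train, key=lambda example: sq_dist(example[0], x))[1]

# Labelled data, already categorized by humans.
train = [([1.0, 1.0], "cat"), ([9.0, 9.0], "dog")]

# New, unseen data point: predicted from the closest labelled example.
print(predict(train, [2.0, 1.5]))  # → cat
```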
b) Unsupervised Learning
• Unsupervised learning is used when there is no labelled data available.
• The goal of unsupervised learning is to identify patterns or structure in the data without any prior knowledge of what the data represents.
• This can be useful for tasks such as clustering, where the goal is to group similar data points together.
• There are two main types of unsupervised learning: clustering and dimensionality reduction.
c) Reinforcement Learning
• Reinforcement learning (RL) is a subfield of machine learning that focuses on training agents to make decisions in an environment in order to maximize a cumulative reward.
• Unlike supervised and unsupervised learning, which deal with labelled and unlabelled datasets respectively, RL deals with decision-making in a dynamic and interactive setting.
4.7) Adaptive Learning
• Adaptive learning combines technology and artificial intelligence (AI) to create personalised learning paths.
• Adaptive learning refers broadly to a learning process that adapts based on the responses of the individual student.
• Adaptive learning is an educational method which uses computers as interactive teaching devices.
• Computers adapt the presentation of educational material according to students' learning needs, as indicated by their responses to questions, tasks, and experiences.
• Components of adaptive learning:
• Expert model: the model with the information which is to be taught.
• Student model: the model which tracks and learns about the student.
• Instructional model: the model which actually conveys the information.
a) Expert model
• The expert model stores information about the material being taught.
• This can be as simple as the solutions for the question set, but it can also include lessons and tutorials and, in more sophisticated systems, even expert methodologies to illustrate approaches to the questions.
• Adaptive learning systems which do not include an expert model will typically incorporate these functions in the instructional model.
b) Student model
• Determining a student's skill level is the method employed in CAT (computerized adaptive testing).
• In CAT, the subject is presented with questions that are selected based on their level of difficulty in relation to the presumed skill level of the subject.
• As the test proceeds, the computer adjusts the subject's score based on their answers, continuously fine-tuning the score by selecting questions from a narrower range of difficulty.
b) Student model
• An algorithm for a CAT-style assessment is simple to implement.
• A large pool of questions is amassed and rated according to difficulty, through expert analysis, experimentation, or a combination of the two.
• The computer then performs what is essentially a binary search, always giving the subject a question which is halfway between what the computer has already determined to be the subject's maximum and minimum possible skill levels.
• These levels are then adjusted to the difficulty of the question just asked: the minimum is raised if the subject answered correctly, and the maximum is lowered if the subject answered incorrectly.
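The binary-search procedure described above can be sketched directly. The skill scale of 0–100, the number of rounds, and the simulated test-taker are assumptions; no error margin is modelled here (that refinement is discussed on the next slide).

```python
# CAT-style skill estimation as a binary search over a 0-100 skill scale.
def estimate_skill(answers_correctly, lo=0.0, hi=100.0, rounds=10):
    """Ask a question halfway between the current minimum (lo) and
    maximum (hi) skill bounds, then tighten the bound the answer rules out."""
    for _ in range(rounds):
        difficulty = (lo + hi) / 2       # question halfway between bounds
        if answers_correctly(difficulty):
            lo = difficulty              # correct: raise the minimum
        else:
            hi = difficulty              # incorrect: lower the maximum
    return (lo + hi) / 2

# Simulated subject who answers correctly up to difficulty 62.
true_skill = 62.0
print(round(estimate_skill(lambda d: d <= true_skill), 1))  # → 62.0
```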
b) Student model
• Obviously, a certain margin for error has to be built in to allow for scenarios where the subject's answer is not indicative of their true skill level but is simply coincidental.
• Asking multiple questions from one level of difficulty greatly reduces the probability of a misleading answer, and allowing the range to grow beyond the assumed skill level can compensate for possible misevaluations.
c) Instructional model
• The instructional model generally looks to incorporate the best educational tools that technology has to offer (such as multimedia presentations) together with expert teacher advice on presentation methods.
• In a CAT-style student model, the instructional model simply ranks lessons in correspondence with the ranks for the question pool.
• When the student's level has been satisfactorily determined, the instructional model provides the appropriate lesson.
• The instructional model can also be designed to analyse the collection of weaknesses and tailor a lesson plan accordingly.
c) Instructional model
• When incorrect answers are being evaluated by the student model, some systems provide feedback on the actual questions in the form of ‘hints’.
• As the student makes mistakes, useful suggestions pop up, such as “look carefully at the sign of the number”.

Planning and Machine learning, Strips, K Strips

  • 1.
    1 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT Subject: Subject Code: 22MCA262 Artificial Intelligence
  • 2.
    2 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT ESTD: 2003 Dr. Vishwanath V Murthy, MCA, RNSIT 4.1) Planning and machine learning 4.2) Basic plan generation system – Strips 4.3) Advanced plan generation system – K Strips 4.4) Strategic explanation - Why, why not, and how explanation 4.5) Learning a) Machine learning b) Adaptive learning Module-4: Planning and Machine learning
  • 3.
    3 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT 4.1) Planning and machine learning
  • 4.
    4 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT Planning  Set of actions Action  Preconditions (Need to be satisfied)  Components of Planning System: In any general problem-solving systems, elementary techniques to perform are a) Choose the best rule (based on heuristics) to be applied b) Apply the chosen rule to get new problem state c) Detect when a solution has been found d) Detect dead ends so that new directions are explored. 4.1) Planning and machine learning
  • 5.
    5 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT  The block world planning problem,  All Square blocks of same size  Blocks can be stacked one upon another.  Flat surface (table) on which blocks can be placed.  Only one block can be moved at a time to achieve the target. Planning for Block World Problem
  • 6.
    6 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT  The five predicates used and their intended meaning 1. ON(A,B) : Block A is on B 2. ON(A, table) : A is on table 3. CLEAR(A) : Nothing is on top of A 4. HOLDING(A) : Arm is holding A. 5. ARMEMPTY : Arm is holding nothing Example: Block World Problem
  • 7.
    7 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT  Robot does four types of operations. Example: Block World Problem Rule/operation Precondition Add Delete PICKUP (X): STACK (X, Y): UNSTACK (X, Y) PUTDOWN (X): holding(X) On(X, table) clear(X) ARMEMPTY holding(X) clear(Y) On(X, Y) clear(X) ARMEMPTY holding(X) On(X, table) clear(X) ARMEMPTY On(X, Y) clear(X) holding(X) clear(Y) On(X, table) clear(X) ARMEMPTY holding(X) clear(Y) holding(X) On(X, Y) clear(X) ARMEMPTY
  • 8.
    8 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT ESTD: 2003 Dr. Vishwanath V Murthy, MCA, RNSIT 4.1) Planning and machine learning 4.2) Basic plan generation system – Strips 4.3) Advanced plan generation system – K Strips 4.4) Strategic explanation - Why, why not, and how explanation 4.5) Learning a) Machine learning b) Adaptive learning Module-4: Planning and Machine learning
  • 9.
    9 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT  One of the most influential approaches to automated planning is  STRIPS – (Stanford Research Institute Problem Solver)  STRIPS is a language used for expressing planning problems.  It is primarily concerned with the automatic generation of plans, which are sequences of actions.  Knowledge Representation: First Order Logic.  Algorithm: Forward chaining on rules. 4.2) STRIPS .
  • 10.
    10 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT  The STRIPS algorithm operates by maintaining a database of predicates that describe the state of the world.  Planning with STRIPS: 1. Define the Initial State: Where the system starts. 2. Set the Goal State: What the system should achieve. 3. Develop Actions: Defined by their preconditions and effects. 4. Search for Solutions: Using a strategy like backward chaining from the goal state to the initial state, identifying actions that satisfy the goal conditions. 4.2) STRIPS .
  • 11.
    11 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT  Components of STRIPS: • States: Defined by a set of logical propositions. • Goals: Specified as a set of conditions that describe the desired outcome. • Actions: Each action in STRIPS is characterized by three components: • Preconditions: Conditions that must be true for the action to be executed. • Add Effects: Conditions that become true due to executing the action. • Delete Effects: Conditions that become false due to executing the action. 4.2) STRIPS .
  • 12.
    12 ESTD: 2003 Dr. VishwanathV Murthy, MCA, RNSIT  The five predicates used and their intended meaning 1. ON(A,B) : Block A is on B 2. ON(A, table) : A is on table 3. CLEAR(A) : Nothing is on top of A 4. HOLDING(A) : Arm is holding A. 5. ARMEMPTY : Arm is holding nothing Example: Block World Problem
  • 13.
4.2) STRIPS

• Example: Using STRIPS for Blocks World
• (Slides 14–18: worked blocks-world figures.)
4.3) K STRIPS

• Modal operator K:
• We are familiar with the use of the connectives '∧' and '∨' in logic.
• These connectives (operators) construct more complex formulas from simpler components.
• In K STRIPS, we want to construct a formula whose intended meaning is that a certain agent knows a certain proposition.
• The components consist of
− a term denoting the agent, and
− a formula denoting a proposition that the agent knows.
• To accomplish this, the modal operator K is introduced.
4.3) K STRIPS

• For example, to say that Robot (the name of an agent) knows that block A is on block B, we write:
K(Robot, On(A,B))
• The sentence formed by combining K with the term Robot and the formula On(A,B) is a new formula, the intended meaning of which is "Robot knows that block A is on block B".
• The words "know" and "believe" differ in meaning: an agent can believe a false proposition, but it cannot know anything that is false.
4.3) K STRIPS

• Some examples:
1. K(Agent1, K(Agent2, On(A,B)))
− Means Agent1 knows that Agent2 knows that A is on B.
2. K(Agent1, On(A,B)) ∨ K(Agent1, On(A,C))
− Means that either Agent1 knows that A is on B, or it knows that A is on C.
3. K(Agent1, On(A,B)) ∨ K(Agent1, ¬On(A,B))
− Means that Agent1 knows whether or not A is on B.
4.3) K STRIPS

• Example: planning a speech action.
• We can treat speech acts just like other agent actions.
• Our agent can use a plan-generating system to make plans comprising speech acts and other actions.
• To do so, it needs a model of the effects of these actions.
• Consider, for example, Tell(A, φ), where A is an agent and φ is a formula.
• We could model the effects of that action by the STRIPS rule:
Tell(A, φ):
− Precondition: Next_to(A) ∧ φ ∧ ¬K(A, φ)
− Delete: ¬K(A, φ)
− Add: K(A, φ)
4.3) K STRIPS

• We could model the effects of that action by the STRIPS rule:
Tell(A, φ):
− Precondition: Next_to(A) ∧ φ ∧ ¬K(A, φ)
− Delete: ¬K(A, φ)
− Add: K(A, φ)
• The precondition Next_to(A) ensures that our agent is close enough to agent A to enable communication.
• The precondition φ is imposed to ensure that our agent actually believes φ before it can inform another agent of its truth.
• The precondition ¬K(A, φ) ensures that our agent does not communicate redundant information.
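The Tell(A, φ) rule can be sketched as an executable STRIPS-style operator over a set of believed facts. The string encodings ("K(...)", "-K(...)") and the agent name B are our own illustrative assumptions, not part of the original rule.

```python
# Tell(A, φ) as a STRIPS-style rule: check preconditions, then apply
# the delete and add lists.

def tell(state, agent, phi):
    """Apply Tell(agent, phi) when its preconditions hold; else leave state unchanged."""
    preconditions_hold = (
        f"Next_to({agent})" in state            # close enough to communicate
        and phi in state                        # we believe phi ourselves
        and f"K({agent},{phi})" not in state    # not redundant information
    )
    if not preconditions_hold:
        return state
    new_state = set(state)
    new_state.discard(f"-K({agent},{phi})")  # delete: ¬K(A, φ)
    new_state.add(f"K({agent},{phi})")       # add:    K(A, φ)
    return new_state

state = tell({"Next_to(B)", "On(A,B)", "-K(B,On(A,B))"}, "B", "On(A,B)")
```

After the action, the agent's model records that B knows On(A,B).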
4.4) Strategic Explanation

• What is strategic explanation?
− The practice of explaining AI decisions in a way that aligns with human understanding and goals.
− It focuses on three core types of explanations: Why, Why Not, and How.
• Why is it important?
− Transparency: enables users to trust AI decisions.
− Regulatory compliance: many sectors require explanations for decisions made by AI systems.
− User engagement: improves usability and satisfaction.
4.4) Strategic Explanation

1. Why Explanation
• Explains why a specific decision or recommendation was made by the AI system.
• Provides the reasoning behind the AI's outcome.
• Helps users understand the system's logic.
• Builds trust and confidence in the AI system.
• Example: a medical diagnosis system
− explains why it recommended a certain treatment;
− the reason given may be that the patient's symptoms matched a specific pattern seen in historical data.
4.4) Strategic Explanation

1. Why Explanation
• Techniques:
− Feature importance: highlights the key factors influencing the decision (e.g., age, income, symptoms).
− Rule-based explanations: simple "if-then" rules that describe why the decision was made.
• Challenges:
− Complex AI models (e.g., deep learning) make "why" explanations harder to extract.
4.4) Strategic Explanation

2. Why Not Explanation
• Explains why an expected or alternative outcome was not selected.
• Clarifies why a particular decision was rejected.
• Manages user expectations and reduces dissatisfaction.
• Example: a loan application system
− the AI system might explain why a loan was denied (e.g., insufficient credit score).
4.4) Strategic Explanation

2. Why Not Explanation
• Techniques:
− Counterfactual explanations: describe what would need to change for the alternative outcome. Example: "If your credit score had been 50 points higher, your loan would have been approved."
− Contrastive explanations: compare the actual decision that was made with the alternative decision.
• Challenges:
− Developing accurate counterfactuals or alternatives in high-dimensional data spaces can be difficult and computationally expensive.
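The counterfactual technique can be illustrated with the loan example. A sketch under stated assumptions: the approval threshold (700) and the function name are hypothetical, purely for illustration.

```python
# A toy "why not" counterfactual for a single-threshold loan rule.

APPROVAL_THRESHOLD = 700  # illustrative value, not from any real lender

def why_not_approved(credit_score):
    """Return a counterfactual explanation for denial, or None when approved."""
    if credit_score >= APPROVAL_THRESHOLD:
        return None
    gap = APPROVAL_THRESHOLD - credit_score
    return (f"If your credit score had been {gap} points higher, "
            f"your loan would have been approved.")

message = why_not_approved(650)
```

Real systems must search for the nearest change across many features, which is where the computational expense noted above comes from.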
4.4) Strategic Explanation

3. How Explanation
• Explains how the AI system arrived at the decision, revealing its internal processes.
• Makes the decision-making process transparent, particularly for technical users (developers and data scientists).
• Example: a deep neural network system
− shows how the different layers transformed inputs into outputs.
4.4) Strategic Explanation

3. How Explanation
• Techniques:
− Model visualization: graphical representations of how the input is processed (e.g., the layers in a neural network).
− Attention maps: highlight the parts of the input that the model focused on during decision-making.
• Challenges:
− For non-technical users, these explanations can be too detailed and hard to understand, requiring simplification or abstraction.
4.5) Learning

• What is learning?
• Definition: changes in the system that are adaptive in the sense that they enable the system to do the same task more efficiently and more effectively the next time.
• A criticism of AI: an AI system cannot be called intelligent until it can learn to do new things and adapt to new situations, rather than simply doing as it is told.
• Learning covers a wide range of phenomena:
1) Skill refinement: practice improves skills. The more you play tennis, the better you get.
2) Knowledge acquisition: knowledge is generally acquired through experience.
4.5) Learning

• Various learning mechanisms:
1) Rote learning
2) Learning by taking advice
3) Learning from problem solving
4) Learning from examples
4.5.1) Rote Learning

• When a computer stores a piece of data, it is performing a basic form of learning.
• In data caching, we store computed values so that we need not recompute them later. When computation is more expensive than recall, this strategy can save a significant amount of time.
• Caching has been used in AI programs to produce some surprising performance improvements. Such caching is known as rote learning.
• Rote learning shows the need for capabilities such as:
− organized storage of information, and
− generalization.
4.5.1) Rote Learning

Fig: Storing backed-up values
• In figure (a), the value of node A is computed as 10 and stored for future use.
• In figure (b), the value of A is required again to solve a new game tree. Instead of recomputing it, rote learning is used and the stored value of node A is applied directly.
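Rote learning as caching can be sketched with a memoised computation. The Fibonacci function below is only an illustration of the "recall is cheaper than recomputation" point, not taken from the source.

```python
# Rote learning as caching: store a computed value so recall replaces
# recomputation.

calls = 0
cache = {}

def fib(n):
    """Naive recursion plus a cache of previously computed values."""
    global calls
    calls += 1
    if n in cache:                 # rote learning: reuse the stored value
        return cache[n]
    result = n if n < 2 else fib(n - 1) + fib(n - 2)
    cache[n] = result
    return result

first = fib(20)                    # computed recursively, results cached
calls_after_first = calls
second = fib(20)                   # answered from the cache
calls_for_second = calls - calls_after_first   # a single lookup
```

This is exactly the backed-up-value reuse of the figure: node A's value is stored once and recalled thereafter.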
4.5.2) Learning by Taking Advice

• When a programmer writes a series of instructions into a computer, a basic type of learning is taking place: the programmer is a sort of teacher, and the computer is a sort of student.
• After being programmed, the computer is able to do something it previously could not.
• If the program is written in a high-level language such as Prolog, an interpreter or compiler must intervene to change the teacher's instructions into code that the machine can execute directly.
4.5.2) Learning by Taking Advice

• FOO is a program that accepts advice for playing "hearts", a card game. A human user first translates the advice from English into a representation that FOO can understand.
• For example, "Avoid taking points" becomes:
(avoid (take-points me) (trick))
• By UNFOLDing the definition of avoid, FOO comes up with:
(achieve (not (during (trick) (take-points me))))
4.5.2) Learning by Taking Advice

• FOO considers the advice to apply to the player called "me". Next, FOO UNFOLDs the definition of trick:
(achieve (not (during
                (scenario (each p1 (players) (play-card p1))
                          (take-trick (trick-winner)))
                (take-points me))))
4.5.3) Learning in Problem Solving

• Can a program get better without the aid of a teacher? It can, by generalizing from its own experiences.
• Various techniques are as follows:
a) Learning by parameter adjustment
b) Learning with macro-operators
c) Learning by chunking
4.5.3) Learning in Problem Solving

a) Learning by Parameter Adjustment
• Many programs rely on an evaluation procedure that combines information from several sources into a single summary statistic.
1) Game-playing programs do this in their static evaluation functions, in which a variety of factors such as piece advantage and mobility are combined into a single score reflecting the desirability (goodness or usefulness) of a particular board position.
2) Pattern classification programs often combine several features to determine the correct category into which a given stimulus should be placed.
• In designing such programs, it is often difficult to know a priori how much weight should be attached to each feature being used.
• One way of finding the correct weights is to begin with some estimate of the correct settings and then let the program modify the settings based on its experience.
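The "start from an estimate and modify with experience" idea can be sketched as a simple error-driven weight update. The data, target, learning rate and helper names are invented for illustration; this is not any specific program's method.

```python
# Parameter adjustment: nudge each weight after every experience so that
# the summary statistic moves toward the desired value.

def predict(weights, features):
    """The summary statistic: a weighted sum of feature values."""
    return sum(w * f for w, f in zip(weights, features))

def adjust(weights, features, target, rate=0.1):
    """Shift each weight in the direction that reduces the prediction error."""
    error = target - predict(weights, features)
    return [w + rate * error * f for w, f in zip(weights, features)]

weights = [0.0, 0.0]               # initial estimate of the settings
for _ in range(50):                # repeated experience with one example
    weights = adjust(weights, [1.0, 2.0], target=5.0)
```

After repeated experience the weighted sum converges to the target score, which is how a static evaluation function could tune its feature weights.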
4.5.3) Learning in Problem Solving

b) Learning with Macro-Operators
• Sequences of actions that can be treated as a whole are called macro-operators.
• Example: suppose we want to go to the main post office of the city. Our solution may involve getting into our car, starting it, and driving along a certain route. Substantial planning may go into choosing the appropriate route.
• However, we need not plan "how to start the car": we can use START-CAR as an atomic action, even though it consists of several primitive actions:
1) sitting down,
2) adjusting the mirror,
3) inserting the key, and
4) turning the key.
4.5.3) Learning in Problem Solving

b) Learning with Macro-Operators
• Macro-operators were used in the early problem-solving system STRIPS.
• After each problem-solving episode, the learning component takes the computed plan and stores it away as a macro-operator, or MACROP.
• A MACROP is just like a regular operator, except that it consists of a sequence of actions, not just a single one.
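A MACROP can be sketched as a stored sequence of primitive actions replayed as if it were one operator, using the START-CAR example above. The state encoding and function names are ours, purely illustrative.

```python
# START-CAR as a MACROP: a computed plan of primitive actions stored
# and replayed as a single operator.

def sit_down(s):      return {**s, "seated": True}
def adjust_mirror(s): return {**s, "mirror_ok": True}
def insert_key(s):    return {**s, "key_in": True}
def turn_key(s):      return {**s, "engine_on": True}

def make_macrop(steps):
    """Package a stored plan as one reusable operator."""
    def macrop(state):
        for step in steps:
            state = step(state)
        return state
    return macrop

start_car = make_macrop([sit_down, adjust_mirror, insert_key, turn_key])
state = start_car({})              # looks atomic, runs four primitives
```

To the planner, `start_car` is indistinguishable from a single operator, which is the point of storing MACROPs.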
4.5.3) Learning in Problem Solving

c) Learning by Chunking
• Chunking is a process similar in flavor to macro-operators. The idea of chunking comes from the psychological literature on memory and problem solving. Its computational basis is in production systems.
• When a system detects a useful sequence of production firings, it creates a chunk, which is essentially a large production that does the work of an entire sequence of smaller ones.
• SOAR is an example of a production system which uses chunking.
• Chunks learned during the initial stages of solving a problem are applicable in the later stages of the same problem-solving episode. After a solution is found, the chunks remain in memory, ready for use in the next problem.
• At present, chunking is inadequate for duplicating the contents of large directly-computed macro-operator tables.
4.5.4) Learning from Examples

• Classification is the process of assigning, to a particular input, the name of a class to which it belongs. The classes from which the classification procedure can choose can be described in a variety of ways.
• Their definition will depend on the use to which they are put. Classification is an important component of many problem-solving tasks.
• Before classification can be done, the classes it will use must be defined:
1) Isolate a set of features that are relevant to the task domain, and define each class by a weighted sum of the values of these features. Example: if the task is weather prediction, the parameters can be measurements such as rainfall or the location of cold fronts.
2) Isolate a set of features that are relevant to the task domain, and define each class as a structure composed of these features. Example: in classifying animals, the features can be such things as color or length of neck.
• The idea of producing a classification program that can evolve its own class definitions is called concept learning or induction.
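Approach (1) above, defining each class by a weighted sum of feature values and assigning an input to the highest-scoring class, can be sketched as follows. The weather classes and weights are invented for illustration.

```python
# Each class is a set of feature weights; classification picks the class
# whose weighted sum over the input features is largest.

CLASS_WEIGHTS = {
    "rainy": {"humidity": 0.8, "pressure_drop": 0.6},
    "sunny": {"humidity": -0.5, "pressure_drop": -0.7},
}

def score(weights, features):
    """Weighted sum of the features a class cares about."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())

def classify(features):
    return max(CLASS_WEIGHTS, key=lambda c: score(CLASS_WEIGHTS[c], features))

label = classify({"humidity": 0.9, "pressure_drop": 0.7})
```

Concept learning would then mean evolving the weight tables themselves from examples rather than fixing them by hand.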
4.5.4) Learning from Examples

Winston's Learning Program
• This is an early structural concept-learning program. It operates in a simple blocks-world domain. Its goal was to construct representations of the definitions of concepts in the blocks domain.
• For example, it learned the concepts House, Tent and Arch.
• A near miss is an object that is not an instance of the concept in question but that is very similar to such instances.
4.6) Machine Learning

• Definition:
− Machine learning is a subset of artificial intelligence.
− It allows systems to learn from data, improve over time, and make decisions or predictions without being explicitly programmed.
− Systems learn from past experience.
4.6) Machine Learning

• Types of learning in ML:
a) Supervised learning: learning from labelled data.
b) Unsupervised learning: discovering hidden patterns in unlabelled data.
c) Reinforcement learning: learning through trial and error based on rewards.
4.6) Machine Learning

a) Supervised Learning: learning from labelled data
• Supervised learning is the most common type of machine learning.
• It involves training an algorithm to make predictions based on labelled data.
• Labelled data is data that has already been categorized or classified by humans.
• The goal of supervised learning is to create a model that can accurately predict the label of new, unseen data.
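A minimal supervised learner illustrating the idea: 1-nearest-neighbour over a tiny labelled dataset. The points and labels are invented for illustration.

```python
# Supervised learning in miniature: predict the label of a new point
# from human-labelled examples.

labelled = [((1.0, 1.0), "small"), ((1.2, 0.8), "small"),
            ((8.0, 9.0), "large"), ((9.0, 8.5), "large")]

def predict(x):
    """Answer with the label of the closest labelled example."""
    def dist2(p):
        return (p[0] - x[0]) ** 2 + (p[1] - x[1]) ** 2
    _, label = min(labelled, key=lambda item: dist2(item[0]))
    return label

guess = predict((8.5, 8.8))        # an unseen point
```

The "model" here is just the labelled data plus a distance rule; real systems fit more compact models, but the labelled-in, prediction-out shape is the same.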
4.6) Machine Learning

b) Unsupervised Learning
• Unsupervised learning is used when there is no labelled data available.
• The goal of unsupervised learning is to identify patterns or structure in the data without any prior knowledge of what the data represents.
• This can be useful for tasks such as clustering, where the goal is to group similar data points together.
• There are two main types of unsupervised learning: clustering and dimensionality reduction.
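Clustering can be illustrated with a tiny k-means sketch (k = 2, one dimension) on unlabelled data. The data and starting centres are invented.

```python
# k-means in miniature: alternate between assigning points to their
# nearest centre and moving each centre to the mean of its group.

data = [1.0, 1.5, 2.0, 10.0, 11.0, 12.0]
centres = [0.0, 5.0]                     # initial guesses

for _ in range(10):
    # assignment step: each point joins its nearest centre
    groups = {0: [], 1: []}
    for x in data:
        nearest = min((0, 1), key=lambda i: abs(x - centres[i]))
        groups[nearest].append(x)
    # update step: each centre moves to the mean of its group
    centres = [sum(g) / len(g) if g else centres[i] for i, g in groups.items()]

centres = sorted(centres)                # two clusters emerge from the data
```

No labels were supplied; the two groups fall out of the structure of the data alone.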
4.6) Machine Learning

c) Reinforcement Learning
• Reinforcement learning (RL) is a subfield of machine learning that focuses on training agents to make decisions in an environment in order to maximize a cumulative reward.
• Unlike supervised and unsupervised learning, which deal with labelled and unlabelled datasets respectively, RL deals with decision-making in a dynamic and interactive setting.
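Trial-and-error learning from rewards can be sketched with tabular Q-learning on a three-state corridor with a reward at the right end. The environment, learning rates and episode count are invented for illustration.

```python
import random

# Tabular Q-learning: start at state 0, reward 1 on reaching state 2.
random.seed(0)  # deterministic exploration for reproducibility

ACTIONS = ("left", "right")
q = {(s, a): 0.0 for s in range(3) for a in ACTIONS}
alpha, gamma = 0.5, 0.9  # learning rate, discount factor

def step(state, action):
    """Environment dynamics: move one cell; reward 1 on reaching state 2."""
    nxt = max(state - 1, 0) if action == "left" else min(state + 1, 2)
    return nxt, (1.0 if nxt == 2 else 0.0)

for _ in range(200):  # episodes of trial and error
    state = 0
    while state != 2:
        action = random.choice(ACTIONS)  # pure exploration
        nxt, reward = step(state, action)
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = nxt

# Greedy policy extracted from the learned values
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(2)}
```

No labels are ever given; the agent learns "go right" purely from the rewards its own actions produce.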
4.7) Adaptive Learning

• Adaptive learning combines technology and artificial intelligence (AI) to create personalised learning paths.
• Adaptive learning refers broadly to a learning process that adapts based on the responses of the individual student.
• It is an educational method which uses computers as interactive teaching devices.
• The computer adapts the presentation of educational material according to students' learning needs, as indicated by their responses to questions, tasks and experiences.
4.7) Adaptive Learning

• Components of adaptive learning:
− Expert model: the model holding the information which is to be taught.
− Student model: the model which tracks and learns about the student.
− Instructional model: the model which actually conveys the information.
4.7) Adaptive Learning

a) Expert model
• The expert model stores information about the material which is being taught.
• This can be as simple as the solutions for the question set, but it can also include lessons and tutorials and, in more sophisticated systems, even expert methodologies that illustrate approaches to the questions.
• Adaptive learning systems which do not include an expert model typically incorporate these functions in the instructional model.
4.7) Adaptive Learning

b) Student model
• Determining a student's skill level is the method employed in CAT (computerized adaptive testing).
• In CAT, the subject is presented with questions selected according to their level of difficulty relative to the presumed skill level of the subject.
• As the test proceeds, the computer adjusts the subject's score based on their answers, continuously fine-tuning the score by selecting questions from a narrower range of difficulty.
4.7) Adaptive Learning

b) Student model
• An algorithm for a CAT-style assessment is simple to implement.
• A large pool of questions is amassed and rated according to difficulty, through expert analysis, experimentation, or a combination of the two.
• The computer then performs what is essentially a binary search, always giving the subject a question which is halfway between what the computer has already determined to be the subject's maximum and minimum possible skill levels.
• These levels are then adjusted according to the difficulty of the question, reassigning the minimum if the subject answered correctly, and the maximum if the subject answered incorrectly.
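The binary search over difficulty described above can be sketched directly. `answers_correctly` is a stand-in for posing a question to the real subject, and the 0-100 difficulty range and round count are illustrative.

```python
# CAT-style skill estimation: repeatedly ask a question halfway between the
# current minimum and maximum possible skill levels, then tighten the range.

def estimate_skill(answers_correctly, low=0, high=100, rounds=7):
    """Narrow [low, high] around the subject's skill level."""
    for _ in range(rounds):
        difficulty = (low + high) // 2
        if answers_correctly(difficulty):
            low = difficulty + 1       # correct: raise the minimum
        else:
            high = difficulty - 1      # incorrect: lower the maximum
    return (low + high) // 2

# a hypothetical subject who answers correctly up to difficulty 63
skill = estimate_skill(lambda d: d <= 63)
```

A real test would add the margin for error discussed next, since a single lucky or careless answer would otherwise skew the whole search.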
4.7) Adaptive Learning

b) Student model
• A certain margin for error has to be built in to allow for scenarios where the subject's answer is not indicative of their true skill level but is simply coincidental.
• Asking multiple questions from one level of difficulty greatly reduces the probability of a misleading answer, and allowing the range to grow beyond the assumed skill level can compensate for possible misevaluations.
4.7) Adaptive Learning

c) Instructional model
• The instructional model generally looks to combine the best educational tools that technology has to offer (such as multimedia presentations) with expert teacher advice on presentation methods.
• In a CAT-style student model, the instructional model simply ranks lessons in correspondence with the ranks for the question pool.
• When the student's level has been satisfactorily determined, the instructional model provides the appropriate lesson.
• The instructional model can also be designed to analyse the collection of weaknesses and tailor a lesson plan accordingly.
4.7) Adaptive Learning

c) Instructional model
• When incorrect answers are being evaluated by the student model, some systems provide feedback on the actual questions in the form of 'hints'.
• As the student makes mistakes, useful suggestions pop up, such as "look carefully at the sign of the number".