1. Agent
• Human agent
• Robotic agent
• Software agent
• Agent function – abstract mathematical
description
• Agent program – concrete implementation
2. Four agent programs
1) Simple reflex agent – current percept
2) Model-based reflex agent – two kinds of
knowledge
3) Goal-based agent – goal information
4) Utility-based agent – performance measures
5. Problem solving by
searching
Rational agents need to perform sequences of actions in
order to achieve goals.
•Intelligent behavior can be generated by having a look-up
table or reactive policy that tells the agent what to do in
every circumstance, but:
-Such a table or policy is difficult to build
-All contingencies must be anticipated
•A more general approach is for the agent to have
knowledge of the world and how its actions affect it and be
able to simulate execution of actions in an internal model of
the world in order to determine a sequence of actions that
will accomplish its goals.
•This is the general task of problem solving and is typically
performed by searching through an internally modelled
space of world states.
6. Problem Solving Task
•Given:
-An initial state of the world
-A set of possible actions or operators that can
be performed.
-A goal test that can be applied to a single state of the
world to determine if it is a goal state.
•Find:
-A solution stated as a path of states and operators that
shows how to transform the initial state into one that
satisfies the goal test.
•The initial state and set of operators implicitly define a state
space of states of the world and operator transitions
between them. May be infinite.
7. Measuring Performance
•Path cost: a function that assigns a cost to a path, typically
by summing the cost of the individual operators in the path.
May want to find minimum cost solution.
•Search cost: The computational time and space (memory)
required to find the solution.
•Generally there is a trade-off between path cost and search
cost and one must satisfice and find the best solution in
the time that is available.
8. Search Algorithm
•Easiest way to implement various search strategies is to
maintain a queue of unexpanded search nodes.
•Different strategies result from different methods for
inserting new nodes in the queue.
function GENERAL-SEARCH(problem, strategy) returns a solution, or failure
initialize the search tree using the initial state of problem
loop do
if there are no candidates for expansion then return failure
choose a leaf node for expansion according to strategy
if the node contains a goal state then return the corresponding solution
else expand the node and add the resulting nodes to the search tree
end
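The loop above can be sketched in Python. This is a minimal illustration, not a full implementation: the problem interface (initial_state, goal_test, expand) and the two insert functions are hypothetical names chosen for this sketch.

```python
from collections import deque

def general_search(problem, insert):
    """Generic search; `insert` decides where new nodes go in the queue."""
    frontier = deque([problem.initial_state])   # initialize with the initial state
    while frontier:
        state = frontier.popleft()              # choose a leaf node for expansion
        if state is not None and problem.goal_test(state):
            return state                        # node contains a goal state
        for child in problem.expand(state):
            insert(frontier, child)             # the strategy lives here
    return None                                 # no candidates for expansion: failure

# FIFO insertion yields breadth-first search; LIFO insertion yields depth-first search.
enqueue_at_end = lambda q, n: q.append(n)
enqueue_at_front = lambda q, n: q.appendleft(n)
```

Swapping `enqueue_at_end` for `enqueue_at_front` changes the strategy without touching the loop, which is the point of the GENERAL-SEARCH formulation.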
9. Criteria for search strategies
• Completeness: is the strategy guaranteed to find a
solution when there is one?
• Time complexity: how long does it take to find a
solution?
• Space complexity: how much memory does it need to
perform the search?
• Optimality: does the strategy find the highest-quality
solution when there are several different solutions?
10. Types of Search strategies
• Uninformed search strategies (blind, exhaustive, brute
force)
do not guide the search with any additional
information about the problem.
•Informed search strategies (heuristic, intelligent) use
information about the problem (estimated distance from a
state to the goal) to guide the search.
11. Uninformed Search
• Breadth First Search
• Uniform cost search
• Depth First Search
• Depth limited Search
• Iterative Deepening Depth First Search
• Bidirectional Search
12. Breadth First Search
• Expands search nodes level by level, all nodes at level d
are expanded before expanding nodes at level d+1
• Implemented by adding new nodes to the end of the
queue(FIFO queue):
• GENERAL-SEARCH(problem, ENQUEUE-AT-END)
• Since it eventually visits every node up to a given depth,
it is guaranteed to be complete.
• Also optimal provided the path cost is a nondecreasing
function of the depth of the node (e.g. all operators of equal
cost), since nodes are explored in depth order.
13. BFS Algorithm
function BREADTH-FIRST-SEARCH(problem) returns a solution, or failure
  node ← a node with STATE = problem.INITIAL-STATE, PATH-COST = 0
  if problem.GOAL-TEST(node.STATE) then return SOLUTION(node)
  frontier ← a FIFO queue with node as the only element
  explored ← an empty set
  loop do
    if EMPTY?(frontier) then return failure
    node ← POP(frontier) /* chooses the shallowest node in frontier */
    add node.STATE to explored
    for each action in problem.ACTIONS(node.STATE) do
      child ← CHILD-NODE(problem, node, action)
      if child.STATE is not in explored or frontier then
        if problem.GOAL-TEST(child.STATE) then return SOLUTION(child)
        frontier ← INSERT(child, frontier)
14.
from collections import defaultdict

# This class represents a directed graph
# using adjacency list representation
class Graph:

    # Constructor
    def __init__(self):
        # default dictionary to store the graph
        self.graph = defaultdict(list)

    # function to add an edge u -> v to the graph
    def addEdge(self, u, v):
        self.graph[u].append(v)

    # Function to print a BFS of the graph from a given
    # source vertex; BFS(s) traverses vertices reachable from s
    def BFS(self, s):
        # Mark all the vertices as not visited
        visited = [False] * (max(self.graph) + 1)
        # Create a queue for BFS, mark the source node
        # as visited and enqueue it
        queue = [s]
        visited[s] = True
        while queue:
            # Dequeue a vertex from the queue and print it
            s = queue.pop(0)
            print(s, end=" ")
            # Get all adjacent vertices of the dequeued
            # vertex s. If an adjacent vertex has not been
            # visited, mark it visited and enqueue it
            for i in self.graph[s]:
                if not visited[i]:
                    queue.append(i)
                    visited[i] = True

# Driver code: create the graph given in the above diagram
g = Graph()
g.addEdge(0, 1)
g.addEdge(0, 2)
g.addEdge(1, 2)
g.addEdge(2, 0)
g.addEdge(2, 3)
g.addEdge(3, 3)
print("Following is Breadth First Traversal"
      " (starting from vertex 2)")
g.BFS(2)
15. Uniform Cost Search
• Uniform cost search modifies the breadth-first strategy
by always expanding the lowest-cost node on the fringe
(as measured by the path cost g(n)),
• rather than the lowest-depth node.
• It is easy to see that breadth-first search is just uniform
cost search with g(n) = DEPTH(n).
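A minimal sketch of uniform cost search using a priority queue. The `neighbors(state)` callback, which yields (successor, step_cost) pairs, is a hypothetical interface chosen for this sketch.

```python
import heapq

def uniform_cost_search(start, goal, neighbors):
    """Always expand the frontier node with the lowest path cost g(n)."""
    frontier = [(0, start, [start])]          # entries are (g, state, path)
    explored = set()
    while frontier:
        g, state, path = heapq.heappop(frontier)   # lowest-cost node on the fringe
        if state == goal:
            return g, path                    # goal test on expansion, not generation
        if state in explored:
            continue
        explored.add(state)
        for succ, cost in neighbors(state):
            if succ not in explored:
                heapq.heappush(frontier, (g + cost, succ, path + [succ]))
    return None
```

On the small example discussed in the editor's notes at the end of these slides (S reaches G via A at cost 11 and via B at cost 10), this returns the cheaper S-B-G path even though S-A-G is generated first.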
17. In the below tree structure, we have shown the traversal of the tree using the BFS algorithm from the root node S to goal node K.
BFS searches in layers, so it will follow the path shown by the dotted arrow, and the traversed path will
be:
S ---> A ---> B ---> C ---> D ---> G ---> H ---> E ---> F ---> I ---> K
Time Complexity: the time complexity of BFS is given by the number of nodes traversed until the shallowest goal node,
where d = depth of the shallowest solution and b = branching factor (maximum number of successors of any node):
T(b) = 1 + b + b^2 + ... + b^d = O(b^d)
Space Complexity: the space complexity of BFS is given by the memory size of the frontier, which is O(b^d).
Completeness: BFS is complete, which means if the shallowest goal node is at some finite depth, then BFS will find a solution.
Optimality: BFS is optimal if the path cost is a non-decreasing function of the depth of the node.
18. Depth First Search
• Always expand node at deepest level of the tree, i.e. one of
the most recently generated nodes.
• When hit a dead-end, backtrack to last choice.
• Implemented by adding new nodes to front of the queue:
• GENERAL-SEARCH(problem, ENQUEUE-AT-FRONT)
• It uses a stack (LIFO) data structure
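The stack-based traversal can be sketched directly. This is a bare illustration: the graph is a plain adjacency dict, a hypothetical interface chosen for the sketch.

```python
def dfs(graph, start):
    """Iterative depth-first traversal using an explicit stack (LIFO)."""
    visited, order = set(), []
    stack = [start]
    while stack:
        v = stack.pop()                        # most recently generated node first
        if v in visited:
            continue
        visited.add(v)
        order.append(v)
        # push children in reverse so the leftmost child is expanded first
        for w in reversed(graph.get(v, [])):
            if w not in visited:
                stack.append(w)
    return order
```

Backtracking is implicit here: when a node has no unvisited successors, the next `pop` simply resumes from the last remaining choice point on the stack.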
20. Depth First Search Properties
• Not guaranteed to be complete since might get lost
following infinite path.
• Not guaranteed optimal since it can find deeper solution
before shallower ones are explored.
• Time complexity in the worst case is O(b^m), where m is the
maximum depth of the tree, since the entire tree may need to be
explored. But if many solutions exist it may find
one quickly before exploring all of the space.
• Space complexity is only O(bm) where m is maximum
depth of the tree since queue just contains a single path
from the root to a leaf node along with remaining sibling
nodes for each node along the path.
• Can impose a depth limit, l, to prevent exploring nodes
beyond a given depth. Prevents infinite regress, but
incomplete if no solution within depth limit.
22. Depth Limited Search
• Depth-limited search avoids the pitfalls of depth-first search by imposing a
cutoff on the maximum depth of a path. This cutoff can be implemented with a
special depth-limited search algorithm, or by using the general search
algorithm with operators that keep track of the depth.
• For example, on the map of Romania, there are 20 cities, so we know that if
there is a solution, then it must be of length 19 at the longest. We can
implement the depth cutoff using operators of the form "If you are in city A and
have travelled a path of fewer than 19 steps, then generate a new state in city B
with a path length that is one greater." With this new operator set, we are
guaranteed to find the solution if it exists, but we are still not guaranteed to
find the shortest solution first: depth-limited search is complete but not
optimal. If we choose a depth limit that is too small, then depth-limited search
is not even complete.
• The time and space complexity of depth-limited search is similar to depth-first
search. It takes O(b^l) time and O(bl) space, where l is the depth limit.
23. In the below search tree, we have shown the flow of depth-first search, and it will
follow the order as:
Root node--->Left node ----> right node.
It will start searching from root node S, and traverse A, then B, then D and E. After
traversing E, it will backtrack the tree, as E has no other successor and the goal node
has not yet been found. After backtracking it will traverse node C and then G, where it
terminates, as it has found the goal node.
24. Completeness: DFS search algorithm is complete within finite state space as it will
expand every node within a limited search tree.
Time Complexity: the time complexity of DFS is equivalent to the number of nodes traversed
by the algorithm. It is given by:
T(b) = 1 + b + b^2 + ... + b^m = O(b^m)
where m = maximum depth of any node, and this can be much larger than d (the shallowest
solution depth)
Space Complexity: DFS algorithm needs to store only single path from the root node,
hence space complexity of DFS is equivalent to the size of the fringe set, which is O(bm).
Optimal: DFS is non-optimal, as it may take a large number of steps
or a high-cost path to reach the goal node.
25. Constraint Satisfaction
Problem
• Efficient way of solving wide variety of problems
• Use factored representation for each state
• A problem is solved when each variable has a value that
satisfies all the constraints on the variable.
• A problem described this way is called a Constraint
Satisfaction Problem, or CSP.
26. Constraint Satisfaction
Problem
• A constraint satisfaction problem (or CSP) is a special kind of
problem that satisfies some additional structural properties
beyond the basic requirements for problems in general.
• In a CSP, the states are defined by the values of a set of variables
and the goal test specifies a set of constraints that the values
must obey.
• For example, the 8-queens problem can be viewed as a CSP in
which the variables are the locations of each of the eight queens;
the possible values are squares on the board; and the
constraints state that no two queens can be in the same row,
column or diagonal.
• A solution to a CSP specifies values for all the variables such
that the constraints are satisfied.
• Cryptarithmetic and VLSI layout can also be described as CSPs
(Exercise 3.20).
27. Constraint Satisfaction Problems (CSPs)
• Standard search problem: a CSP consists of three components: X, D
and C
• CSP:
o X is a set of variables, {X1, X2, ..., Xn}
o D is a set of domains, {D1, D2, ..., Dn}, one for each variable
o C is a set of constraints (the goal test) specifying allowable combinations of
values for subsets of variables
o A constraint Ci consists of a pair ⟨scope, rel⟩,
where scope is a tuple of variables that participate in the constraint
and rel is the relation that defines the values that those variables can take
on.
• Simple example of a formal representation language
• Allows useful general-purpose algorithms with more power
than standard search algorithms
28. CSP…
• A relation can be represented as an explicit list of all tuples of
values that satisfy the constraint,
• or as an abstract relation that supports two operations: testing if
a tuple is a member of the relation and enumerating the
members of the relation.
• For example, if X1 and X2 both have the domain {A,B}, then the
constraint saying the two variables must have different values
can be written as ⟨(X1, X2), [(A,B), (B,A)]⟩ or as ⟨(X1, X2), X1 ≠ X2⟩
• To solve a CSP, we need to define a state space and the notion of
a solution.
• Each state in a CSP is defined by an assignment of values to
some or all of the variables, {Xi= vi , Xj = vj , . . .}.
• An assignment that does not violate any constraints is called a
consistent or legal assignment.
• A complete assignment is one in which every variable is
assigned, and a solution to a CSP is a consistent, complete
assignment.
• A partial assignment is one that assigns values to only some of
the variables.
29. Example: Map-Coloring
• Variables WA, NT, Q, NSW, V, SA, T
• Domains Di = {red,green,blue}
• Constraints: adjacent regions must have different colors
• C = {SA ≠ WA, SA ≠ NT, SA ≠ Q, SA ≠ NSW, SA ≠ V,
• WA ≠ NT, NT ≠ Q, Q ≠ NSW, NSW ≠ V} .
• e.g., WA ≠ NT, or (WA,NT) in {(red,green),(red,blue),(green,red),
(green,blue),(blue,red),(blue,green)}
30. Example: Map-Coloring
• Solutions are complete and consistent assignments
• e.g., WA = red, NT = green, Q = red, NSW = green,
V = red, SA = blue, T = green
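The map-coloring CSP above can be solved by a minimal backtracking search. This is a bare sketch with no variable-ordering or inference heuristics; the representation (dicts for domains and neighbors) is an assumption of the sketch.

```python
def backtrack(assignment, variables, domains, neighbors):
    """Minimal backtracking search for a binary not-equal CSP (map coloring)."""
    if len(assignment) == len(variables):
        return assignment                      # complete, consistent assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        # consistency check: no already-assigned neighbor may share the color
        if all(assignment.get(n) != value for n in neighbors[var]):
            assignment[var] = value
            result = backtrack(assignment, variables, domains, neighbors)
            if result is not None:
                return result
            del assignment[var]                # undo and try the next value
    return None

# The Australia map-coloring instance from the slides
variables = ['WA', 'NT', 'SA', 'Q', 'NSW', 'V', 'T']
domains = {v: ['red', 'green', 'blue'] for v in variables}
neighbors = {'WA': ['NT', 'SA'], 'NT': ['WA', 'SA', 'Q'],
             'SA': ['WA', 'NT', 'Q', 'NSW', 'V'], 'Q': ['NT', 'SA', 'NSW'],
             'NSW': ['Q', 'SA', 'V'], 'V': ['SA', 'NSW'], 'T': []}
solution = backtrack({}, variables, domains, neighbors)
```

Because T has no neighbors, any of its three colors works; the mainland regions force the three-coloring pattern around SA.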
31. Constraint graph
• Binary CSP: each constraint relates two variables
• Constraint graph: nodes are variables, arcs are constraints
32. Job-shop Scheduling
• a small part of the car assembly, consisting of 15 tasks:
• install axles (front and back), affix all four wheels (right and left,
front and back), tighten nuts for each wheel, affix hubcaps, and
inspect the final assembly. We can represent the tasks with 15
variables:
• X = {AxleF , AxleB,WheelRF ,WheelLF ,WheelRB,WheelLB,
NutsRF ,NutsLF , NutsRB, NutsLB, CapRF , CapLF , CapRB, CapLB,
Inspect} .
• The value of each variable is the time that the task starts. Next
we represent precedence constraints between individual
tasks.
• Whenever a task T1 must occur before task T2, and task T1 takes
duration d1 to complete, we add an arithmetic constraint of the
form
• T1 + d1 ≤ T2 .
33. • The axles have to be in place before the wheels are put on,
and it takes 10 minutes to install an axle, so we write
• AxleF + 10 ≤ WheelRF ; AxleF + 10 ≤ WheelLF ;
• AxleB + 10 ≤ WheelRB; AxleB +10 ≤ WheelLB .
• Next we say that, for each wheel, we must affix the wheel
(which takes 1 minute), then tighten the nuts (2 minutes),
and finally attach the hubcap (1 minute, but not
represented yet):
• WheelRF + 1 ≤ NutsRF ; NutsRF + 2 ≤ CapRF ;
• WheelLF + 1 ≤ NutsLF ; NutsLF +2 ≤ CapLF ;
• WheelRB + 1 ≤ NutsRB; NutsRB + 2 ≤ CapRB;
• WheelLB + 1 ≤ NutsLB; NutsLB + 2 ≤ CapLB .
34. • Suppose we have four workers to install wheels, but they have to share
one tool that helps put the axle in place. We need a disjunctive
constraint to say that AxleF and AxleB must not overlap in time; either
one comes first or the other does:
• (AxleF + 10 ≤ AxleB) or (AxleB + 10 ≤ AxleF ) .
• This looks like a more complicated constraint, combining arithmetic
and logic. But it still reduces to a set of pairs of values that AxleF and
AxleB can take on.
• We also need to assert that the inspection comes last and takes 3
minutes. For every variable except Inspect we add a constraint of the
form X +dX ≤ Inspect .
• Finally, suppose there is a requirement to get the whole assembly done
in 30 minutes.
• We can achieve that by limiting the domain of all variables:
• Di = {1, 2, 3, . . . , 27} .
• This particular problem is trivial to solve, but CSPs have been applied
to job-shop scheduling problems like this with thousands of variables.
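A small sketch that checks a candidate schedule against the constraints above. The schedule at the bottom is one hand-built assignment used for illustration, not the output of a solver.

```python
# Durations (minutes) for the 15 tasks from the slides
durations = {'AxleF': 10, 'AxleB': 10,
             'WheelRF': 1, 'WheelLF': 1, 'WheelRB': 1, 'WheelLB': 1,
             'NutsRF': 2, 'NutsLF': 2, 'NutsRB': 2, 'NutsLB': 2,
             'CapRF': 1, 'CapLF': 1, 'CapRB': 1, 'CapLB': 1, 'Inspect': 3}

# Precedence pairs (T1, T2) meaning T1 + d1 <= T2
precedence = [('AxleF', 'WheelRF'), ('AxleF', 'WheelLF'),
              ('AxleB', 'WheelRB'), ('AxleB', 'WheelLB'),
              ('WheelRF', 'NutsRF'), ('NutsRF', 'CapRF'),
              ('WheelLF', 'NutsLF'), ('NutsLF', 'CapLF'),
              ('WheelRB', 'NutsRB'), ('NutsRB', 'CapRB'),
              ('WheelLB', 'NutsLB'), ('NutsLB', 'CapLB')]

def consistent(schedule):
    # precedence constraints: T1 + d1 <= T2
    if any(schedule[t1] + durations[t1] > schedule[t2] for t1, t2 in precedence):
        return False
    # disjunctive constraint: the two axle tasks must not overlap in time
    if not (schedule['AxleF'] + 10 <= schedule['AxleB'] or
            schedule['AxleB'] + 10 <= schedule['AxleF']):
        return False
    # inspection comes last; the whole assembly fits in 30 minutes
    return (all(schedule[x] + durations[x] <= schedule['Inspect']
                for x in schedule if x != 'Inspect')
            and schedule['Inspect'] + 3 <= 30)

# One hand-built assignment of start times (hypothetical example)
schedule = {'AxleF': 1, 'AxleB': 11, 'WheelRF': 11, 'WheelLF': 11,
            'WheelRB': 21, 'WheelLB': 21, 'NutsRF': 12, 'NutsLF': 12,
            'NutsRB': 22, 'NutsLB': 22, 'CapRF': 14, 'CapLF': 14,
            'CapRB': 24, 'CapLB': 24, 'Inspect': 27}
```

A CSP solver would search over start-time assignments; the `consistent` predicate is exactly the goal test such a solver would use.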
35. Varieties of CSPs
• Discrete variables
o finite domains:
• n variables, domain size d ⇒ O(d^n) complete assignments
• e.g., Boolean CSPs, incl. Boolean satisfiability (NP-complete)
o infinite domains:
• integers, strings, etc.
• e.g., job scheduling, variables are start/end days for each job
• need a constraint language, e.g., StartJob1 + 5 ≤ StartJob3
• Continuous variables
o e.g., start/end times for Hubble Space Telescope observations
o linear constraints solvable in polynomial time by LP
36. Varieties of constraints
• Unary constraints involve a single variable,
o e.g., SA ≠ green
• Binary constraints involve pairs of variables,
o e.g., SA ≠ WA
• Higher-order constraints involve 3 or more
variables,
o e.g., cryptarithmetic column constraints
37. Constraint Satisfaction
Problem
• CSPs can be solved by general-purpose search algorithms, but because
of
• their special structure, algorithms designed specifically for CSPs
generally perform much better.
• Constraints come in several varieties. Unary constraints concern the
value of a single variable.
• For example, the variables corresponding to the leftmost digit on any
row of a cryptarithmetic puzzle are constrained not to have the value 0.
Binary constraints relate pairs of variables.
• The constraints in the 8-queens problem are all binary constraints.
Higher-order constraints involve three or more variables—
• for example, the columns in the cryptarithmetic problem must obey
• an addition constraint and can involve several variables.
• Finally, constraints can be absolute constraints, violation of which rules
out a potential solution, or preference constraints that say which
solutions are preferred.
38. Knowledge Representation
& Reasoning
• Knowledge base = a set of sentences
• Declarative approach to building an agent
• Knowledge-based agents are able to accept new tasks in the
form of explicitly described goals; they can achieve competence
quickly by being told or learning new knowledge about the
environment; and they can adapt to changes in the environment
by updating the relevant knowledge.
• A knowledge-based agent needs to know many things: the
current state of the world; how to infer unseen properties of the
world from percepts; how the world evolves over time; what it
wants to achieve; and what its own actions do in various
circumstances.
39. Knowledge Representation
• The object of knowledge representation is to express
knowledge in computer-tractable form, such that it can be
used to help agents perform well.
• A knowledge representation language is defined by two
aspects:
• The syntax
• The semantics
40. Types of Logic
• Propositional Logic
• Predicate Logic (or) First-order Logic
• Temporal Logic
• Probability Theory
• Fuzzy Logic
41. Propositional Logic
• Syntax
• The syntax of propositional logic is simple.
• The symbols of propositional logic are the logical constants True and False,
proposition symbols such as P and Q, the logical connectives ∧, ∨, ⇔,
• ⇒, and ¬, and parentheses ( ). All sentences are made by putting these symbols
together using the following rules:
• The logical constants True and False are sentences by themselves.
• A proposition symbol such as P or Q is a sentence by itself.
• Wrapping parentheses around a sentence yields a sentence, for example, (P ∧ Q).
• A sentence can be formed by combining simpler sentences with one of the five
logical connectives:
• ∧ (and). A sentence whose main connective is ∧, such as P ∧ (Q ∨ R), is called a
• conjunction; its parts are the conjuncts. (The ∧ looks like an "A" for "And.")
• ∨ (or). A sentence using ∨, such as A ∨ (P ∧ Q), is a disjunction of the disjuncts A
• and (P ∧ Q). (Historically, the ∨ comes from the Latin "vel," which means "or." For
• most people, it is easier to remember as an upside-down and.)
42. First Order Logic
• Three main components:
• Objects: nouns or noun phrases
• Relations: verbs or verb phrases
• Functions: relations in which there is only one value for a given input
45. Quantifiers
• A quantifier is a language element which generates
quantification, and quantification specifies the quantity of
specimens in the universe of discourse.
• These are the symbols that permit to determine or identify
the range and scope of the variable in the logical
expression. There are two types of quantifier:
o Universal Quantifier, (for all, everyone, everything)
o Existential quantifier, (for some, at least one).
46. Universal Quantifier
• The universal quantifier is a symbol of logical representation
which specifies that the statement within its range is true
for everything or every instance of a particular thing.
• The universal quantifier is represented by the symbol ∀,
which resembles an inverted A.
• Note: with the universal quantifier we use implication "→".
• If x is a variable, then ∀x is read as:
• For all x
• For each x
• For every x.
47. Existential Quantifier:
• Existential quantifiers are the type of quantifiers, which
express that the statement within its scope is true for at
least one instance of something.
• It is denoted by the logical operator ∃, which resembles an
inverted E. When it is used with a predicate variable, it
is called an existential quantifier.
• Note: In Existential quantifier we always use AND or
Conjunction symbol (∧).
• If x is a variable, then existential quantifier will be ∃x or
∃(x). And it will be read as:
• There exists a 'x.'
• For some 'x.'
• For at least one 'x.'
48. Nested Quantifiers
• We will often want to express more complex sentences
using multiple quantifiers.
• The simplest case is where the quantifiers are of the same
type.
• For example, “Brothers are siblings” can be written as
• ∀x ∀y Brother(x, y) ⇒ Sibling(x, y) .
• Consecutive quantifiers of the same type can be
written as one quantifier with several variables.
• For example, to say that siblinghood is a
symmetric relationship, we can write
• ∀x, y Sibling(x, y) ⇔ Sibling(y, x) .
49. Uncertainty
• So far, we have looked at knowledge representation using
first-order logic and propositional logic with certainty,
which means we were sure about the predicates. With this
kind of representation we might write A→B, which
means if A is true then B is true. But consider a situation
where we are not sure whether A is true or not: then
we cannot express this statement. This situation is called
uncertainty.
• So to represent uncertain knowledge, where we are not
sure about the predicates, we need uncertain reasoning or
probabilistic reasoning.
50. Uncertain Reasoning
• So far in the course, everything has been deterministic
• If I walk with my umbrella, I will not get wet
• But: there is some chance my umbrella will break!
• Intelligent systems must take possibility of failure into
account…
o May want to have backup umbrella in city that is often windy
and rainy
• … but should not be excessively conservative
o Two umbrellas not worthwhile for city that is usually not
windy
• Need quantitative notion of uncertainty
51. Causes of Uncertainty
1. Information from unreliable sources.
2. Experimental errors.
3. Equipment faults.
4. Temperature variation.
5. Climate change.
52. • Let’s consider an example of uncertain reasoning: diagnosing a dental patient’s
toothache.
• Diagnosis—whether for medicine, automobile repair, or whatever—almost
always involves uncertainty.
• Let us try to write rules for dental diagnosis using propositional logic, so that
we can see how the logical approach breaks down. Consider the following
simple rule:
• Toothache ⇒ Cavity .
• The problem is that this rule is wrong. Not all patients with toothaches have
cavities; some of them have gum disease, an abscess, or one of several other
problems:
• Toothache ⇒ Cavity ∨ GumProblem ∨ Abscess . . .
• Unfortunately, in order to make the rule true, we have to add an almost
unlimited list of possible problems. We could try turning the rule into a causal
rule:
• Cavity ⇒ Toothache .
• But this rule is not right either; not all cavities cause pain. The only way to fix
the rule is to make it logically exhaustive: to augment the left-hand side with all
the qualifications required for a cavity to cause a toothache.
53. • Trying to use logic to cope with a domain like
medical diagnosis thus fails for three main reasons:
• Laziness: It is too much work to list the
complete set of antecedents or consequents needed to
ensure an exceptionless rule, and too hard to use such
rules.
• Theoretical ignorance: Medical science has no complete
theory for the domain.
• Practical ignorance: Even if we know all the rules, we
might be uncertain about a particular patient because not
all the necessary tests have been or can be run.
54. Probabilistic Reasoning
• The agent’s knowledge can at best provide only a degree
of belief in the relevant sentences. Our
main tool for dealing with degrees of belief is probability
theory.
• Probability provides a way of summarizing the uncertainty
that comes from our laziness and ignorance, thereby solving
the qualification problem.
55. Probabilistic Reasoning
• Probabilistic reasoning is a way of knowledge representation where we apply
the concept of probability to indicate the uncertainty in knowledge. In
probabilistic reasoning, we combine probability theory with logic to handle the
uncertainty.
• We use probability in probabilistic reasoning because it provides a way to
handle the uncertainty that is the result of someone's laziness and ignorance.
• In the real world, there are lots of scenarios, where the certainty of something
is not confirmed, such as "It will rain today," "behavior of someone for some
situations," "A match between two teams or two players." These are probable
sentences for which we can assume that it will happen but not sure about it, so
here we use probabilistic reasoning.
• Need of probabilistic reasoning in AI:
• When there are unpredictable outcomes.
• When the specifications or possibilities of predicates become too large to handle.
• When an unknown error occurs during an experiment.
• In probabilistic reasoning, there are two ways to solve problems with
uncertain knowledge:
• Bayes' rule
• Bayesian Statistics
56. Conditional probability:
• Conditional probability is the probability of an
event occurring when another event has already happened.
• Suppose we want to calculate the probability of event A when event
B has already occurred, "the probability of A under the
condition B". It can be written as:
• P(A|B) = P(A∧B) / P(B)
• where P(A∧B) = joint probability of A and B
• P(B) = marginal probability of B.
• If the probability of A is given and we need to find the
probability of B, then it is given as:
• P(B|A) = P(A∧B) / P(A)
• It can be explained using a Venn diagram:
when B has occurred, the sample space is reduced
to set B, and we can calculate event A given event
B by dividing the probability P(A∧B)
by P(B).
57. Probability
• Example: roll two dice
• Random variables:
– X = value of die 1
– Y = value of die 2
• Outcome is represented by an
ordered pair of values (x, y)
– E.g., (6, 1): X=6, Y=1
– Atomic event or sample point tells
us the complete state of the world,
i.e., values of all random variables
• Exactly one atomic event will
happen; each atomic event has a
≥0 probability; probabilities sum to 1
(joint distribution table: each of the 36 cells (x, y), with x, y ∈ {1, …, 6}, contains 1/36)
• An event is a proposition about
the state (=subset of states)
– X+Y = 7
• Probability of event = sum of
probabilities of atomic events
where event is true
58. Cards and combinatorics
• Draw a hand of 5 cards from a standard deck with 4*13 = 52
cards (4 suits, 13 ranks each)
• Each of the (52 choose 5) hands has same probability 1/(52
choose 5)
• Probability of event = number of hands in that event / (52
choose 5)
• What is the probability that…
o no two cards have the same rank?
o you have a flush (all cards the same suit?)
o you have a straight (5 cards in order of rank, e.g., 8, 9, 10, J, Q)?
o you have a straight flush?
o you have a full house (three cards have the same rank and the two
other cards have the same rank)?
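Several of these questions can be answered directly with `math.comb`. The counts below use standard combinatorial arguments; note that "flush" as computed here includes straight flushes.

```python
from math import comb

total = comb(52, 5)                              # number of 5-card hands

# No two cards share a rank: choose 5 ranks, then one of 4 suits for each
p_no_pair = comb(13, 5) * 4**5 / total

# Flush (including straight flushes): pick a suit, then any 5 of its 13 cards
p_flush = 4 * comb(13, 5) / total

# Full house: rank for the triple, 3 of its 4 suits; rank for the pair, 2 of 4 suits
p_full_house = 13 * comb(4, 3) * 12 * comb(4, 2) / total
```

Each probability is (number of hands in the event) / (52 choose 5), exactly as the slide states.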
59. Events
• If events A and B are disjoint, then
o P(A or B) = P(A) + P(B)
• More generally:
o P(A or B) = P(A) + P(B) - P(A and B)
• If events A1, …, An are disjoint and exhaustive (one of them must
happen) then P(A1) + … + P(An) = 1
o Special case: for any random variable, ∑x P(X=x) = 1
• Marginalization: P(X=x) = ∑y P(X=x and Y=y)
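These identities can be checked by enumeration on the two-dice joint distribution from the earlier slide:

```python
from itertools import product

# Joint distribution for two fair dice: every atomic event (x, y) has P = 1/36
P = {(x, y): 1/36 for x, y in product(range(1, 7), repeat=2)}

# Marginalization: P(X=x) = sum over y of P(X=x and Y=y)
p_x = {x: sum(P[(x, y)] for y in range(1, 7)) for x in range(1, 7)}

# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B)
A = {e for e in P if e[0] == 6}            # event X = 6
B = {e for e in P if sum(e) == 7}          # event X + Y = 7
p_or = (sum(P[e] for e in A) + sum(P[e] for e in B)
        - sum(P[e] for e in A & B))        # = 6/36 + 6/36 - 1/36 = 11/36
```

Summing a marginal over all values of x recovers 1, which is the "disjoint and exhaustive" special case on the slide.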
60. Conditional probability
• We might know something about the world – e.g., “X+Y=6 or X+Y=7” –
given this (and only this), what is the probability of Y=5?
• Part of the sample space is eliminated; probabilities are renormalized
to sum to 1
(tables: the original 6×6 joint distribution with 1/36 in every cell, and the conditioned
distribution in which only the 11 cells with X+Y = 6 or X+Y = 7 survive, each renormalized to 1/11)
• P(Y=5 | (X+Y=6) or (X+Y=7)) = 2/11
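This renormalization can be verified by enumerating the atomic events:

```python
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))               # 36 atomic events
evidence = [(x, y) for x, y in outcomes if x + y in (6, 7)]   # 11 outcomes survive
favorable = [(x, y) for x, y in evidence if y == 5]           # (1, 5) and (2, 5)
p = len(favorable) / len(evidence)                            # renormalized: 2/11
```

Because every atomic event is equally likely here, conditioning reduces to counting surviving outcomes and dividing.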
61. Conditional probability
• P(A | B) = P(A and B) / P(B)
(Venn diagram: sample space containing overlapping events A and B, with intersection "A and B")
• P(A | B)P(B) = P(A and B)
• P(A | B) = P(B | A)P(A)/P(B)
– Bayes’ rule
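Bayes' rule as a one-line function, applied to hypothetical illustrative numbers (not from the slides):

```python
def bayes(p_b_given_a, p_a, p_b):
    """P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical example: a test detects a disease 90% of the time,
# 1% of people have the disease, and 5% of all tests come back positive.
p_disease_given_positive = bayes(0.9, 0.01, 0.05)   # = 0.18
```

Even with a 90% detection rate, the posterior is only 18% here, because the disease is rare relative to the positive rate; this is the classic use of the rule.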
62. Cards
• Given that your first two cards are Queens,
what is the probability that you will get at least
3 Queens?
• Given that you have at least two Queens (not
necessarily the first two), what is the
probability that you have at least three Queens?
• Given that you have at least two Queens, what is
the probability that you have three Kings?
63. How can we scale this?
• In principle, we now have a complete approach
for reasoning under uncertainty:
o Specify probability for every atomic event,
o Can compute probabilities of events simply by summing probabilities of atomic events,
o Conditional probabilities are specified in terms of probabilities of events: P(A | B) =
P(A and B) / P(B)
• If we have n variables that can each take k values, how many
atomic events are there?
64. Independence
• Some variables have nothing to do with each other
• Dice: if X=6, it tells us nothing about Y
• P(Y=y | X=x) = P(Y=y)
• So: P(X=x and Y=y) = P(Y=y | X=x)P(X=x) =
P(Y=y)P(X=x)
o Usually just write P(X, Y) = P(X)P(Y)
o Only need to specify 6+6=12 values instead of 6*6=36
values
o Independence among 3 variables: P(X,Y,Z)=P(X)P(Y)P(Z), etc.
• Are the events “you get a flush” and “you get a
straight” independent?
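A quick computation suggests they are not independent: the two events overlap exactly in the straight flushes, and the product rule fails. The counts assume the usual convention that the ace may be high or low, giving 10 rank sequences for a straight.

```python
from math import comb

total = comb(52, 5)
p_flush = 4 * comb(13, 5) / total       # all five cards one suit (incl. straight flushes)
p_straight = 10 * 4**5 / total          # 10 rank sequences, any suit for each card
p_straight_flush = 10 * 4 / total       # a straight that is also a flush

# Independence would require P(flush and straight) == P(flush) * P(straight)
independent = abs(p_straight_flush - p_flush * p_straight) < 1e-12
```

Intuitively, knowing the hand is a flush restricts it to one suit, within which the proportion of straights is higher than among all hands, so the events are positively correlated.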
65. Example: rain in two cities
• What is the probability of
o Rain in Beaufort? Rain in Durham?
o Rain in Beaufort, given rain in Durham?
o Rain in Durham, given rain in Beaufort?
• Rain in Beaufort and rain in Durham are correlated
                     Rain in Durham   Sun in Durham
  Rain in Beaufort        .2               .1
  Sun in Beaufort         .2               .5
(disclaimer: no idea if these numbers are realistic)
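From the joint table one can read off the marginals and conditionals asked for above:

```python
# Joint distribution from the table (rows: Beaufort, columns: Durham)
P = {('rain_B', 'rain_D'): 0.2, ('rain_B', 'sun_D'): 0.1,
     ('sun_B', 'rain_D'): 0.2, ('sun_B', 'sun_D'): 0.5}

# Marginals by summing out the other variable
p_rain_beaufort = sum(p for (b, d), p in P.items() if b == 'rain_B')   # 0.3
p_rain_durham = sum(p for (b, d), p in P.items() if d == 'rain_D')     # 0.4

# Conditionals via P(A | B) = P(A and B) / P(B)
p_rain_b_given_rain_d = P[('rain_B', 'rain_D')] / p_rain_durham        # 0.5
p_rain_d_given_rain_b = P[('rain_B', 'rain_D')] / p_rain_beaufort      # ~0.667
```

Both conditionals exceed the corresponding marginals (0.5 > 0.3 and 0.667 > 0.4), which is what "rain in Beaufort and rain in Durham are correlated" means here.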
66. Axioms of Probability
• The following axioms are in fact sufficient:
1. All probabilities are between 0 and 1.
0 ≤ P(A) ≤ 1
2. Necessarily true (i.e., valid) propositions have probability 1,
and necessarily false (i.e., unsatisfiable) propositions have
probability 0.
P(True) = 1    P(False) = 0
3. The probability of a disjunction is given by
P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
Editor's Notes
The strategy first expands the initial state, yielding paths to A, B, and C. Because the path to A is cheapest, it is expanded next, generating the path SAG, which is in fact a solution, though not the optimal one.
However, the algorithm does not yet recognize this as a solution, because it has cost 11, and thus is buried in the queue below the path SB, which has cost 5. It seems a shame to generate a solution just to bury it deep in the queue, but it is necessary if we want to find the optimal solution rather than just any solution. The next
step is to expand SB, generating SBG, which is now the cheapest path remaining in the queue, so it is goal-checked and returned as the solution.
Uniform cost search finds the cheapest solution provided a simple requirement is met: the cost of a path must never decrease as we go along the path. In other words, we insist that
g(SUCCESSOR(n)) ≥ g(n) for every node n.