Ai complete note
Upcoming SlideShare
Loading in...5
×
 

Ai complete note

on

  • 1,338 views

TU, complete AI notes...compiled by Najar Aryal..

TU, complete AI notes...compiled by Najar Aryal..

Statistics

Views

Total Views
1,338
Views on SlideShare
1,338
Embed Views
0

Actions

Likes
2
Downloads
136
Comments
1

0 Embeds 0

No embeds

Accessibility

Upload Details

Uploaded via as Microsoft Word

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Ai complete note Ai complete note Document Transcript

  • Chapter -1: IntroductionWhat is artificial intelligence? It is the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable.It is Duplication of human thought process by machine Learning from experience Interpreting ambiguities Rapid response to varying situations Applying reasoning to problem-solving Manipulating environment by applying knowledge Thinking and reasoningYes, but what is intelligence? Intelligence is the computational part of the ability to achieve goals in the world. Varying kinds and degrees of intelligence occur in people, many animals and some machines.Isnt there a solid definition of intelligence that doesnt depend on relating it to humanintelligence? Not yet. The problem is that we cannot yet characterize in general what kinds of computational procedures we want to call intelligent. We understand some of the mechanisms of intelligence and not others.Acting humanly: The Turing Test approach Fig. The imitation gameAbridged history of AI(summary) 1943 McCulloch & Pitts: Boolean circuit model of brain 1950 Turings "Computing Machinery and Intelligence" 1956 Dartmouth meeting: "Artificial Intelligence" adopted 1950s Early AI programs, including Samuels checkers program, Newell & Simons Logic Theorist, Gelernters Geometry Engine 1965 Robinsons complete algorithm for logical reasoning 1966—73 AI discovers computational complexity, neural network research almost disappears 1969—79 early development of knowledge-based systems 1980-- AI becomes an industry 1986-- Neural networks return to popularity 1987-- AI becomes a science 1995-- The emergence of intelligent agentsPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 1
  • √ Goals of AI Replicate human intelligence "AI is the study of complex information processing problems that often have their roots in some aspect of biological information processing. The goal of the subject is to identify solvable and interesting information processing problems, and solve them." -- David Marr Solve knowledge-intensive tasks "AI is the design, study and construction of computer programs that behave intelligently." -- Tom Dean "... to achieve their full impact, computer systems must have more than processing power-- they must have intelligence. They need to be able to assimilate and use large bodies of information and collaborate with and help people find new ways of working together effectively. The technology must become more responsive to human needs and styles of work, and must employ more natural means of communication." -- Barbara Grosz and Randall Davis Intelligent connection of perception and action AI not centered around representation of the world, but around action in the world. Behavior- based intelligence. (see Rod Brooks in the movie Fast, Cheap and Out of Control) Enhance human-human, human-computer and computer-computer interaction/communication Computer can sense and recognize its users, see and recognize its environment, respond visually and audibly to stimuli. New paradigms for interacting productively with computers using speech, vision, natural language, 3D virtual reality, 3D displays, more natural and powerful user interfaces, etc. (See, for example, projects in Microsofts "Advanced Interactivity and Intelligence" group.)Some Application Areas of AI Game Playing Deep Blue Chess program beat world champion Gary Kasparov Speech Recognition PEGASUS spoken language interface to American Airlines EAASY SABRE reseration system, which allows users to obtain flight information and make reservations over the telephone. The 1990s has seen significant advances in speech recognition so that limited systems are now successful. Computer Vision Face recognition programs in use by banks, government, etc. The ALVINN system from CMU autonomously drove a van from Washington, D.C. to San Diego (all but 52 of 2,849 miles), averaging 63 mph day and night, and in all weather conditions. Handwriting recognition, electronics and manufacturing inspection, photointerpretation, baggage inspection, reverse engineering to automatically construct a 3D geometric model. Expert Systems Application-specific systems that rely on obtaining the knowledge of human experts in an area and programming that knowledge into a system. o Diagnostic Systems Microsoft Office Assistant in Office 97 provides customized help by decision- theoretic reasoning about an individual user. MYCIN system for diagnosing bacterial infections of the blood and suggesting treatments. Intellipath pathology diagnosisPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 2
  • system (AMA approved). Pathfinder medical diagnosis system, which suggests tests and makes diagnoses. Whirlpool customer assistance center. o System Configuration DECs XCON system for custom hardware configuration. Radiotherapy treatment planning. o Financial Decision Making Credit card companies, mortgage companies, banks, and the U.S. government employ AI systems to detect fraud and expedite financial transactions. For example, AMEX credit check. Systems often use learning algorithms to construct profiles of customer usage patterns, and then use these profiles to detect unusual patterns and take appropriate action. o Classification Systems Put information into one of a fixed set of categories using several sources of information. E.g., financial decision making systems. NASA developed a system for classifying very faint areas in astronomical images into either stars or galaxies with very high accuracy by learning from human experts classifications. Mathematical Theorem Proving Use inference methods to prove new theorems. Natural Language Understanding AltaVistas translation of web pages. Translation of Catepillar Truck manuals into 20 languages. (Note: One early system translated the English sentence "The spirit is willing but the flesh is weak" into the Russian equivalent of "The vodka is good but the meat is rotten.") Scheduling and Planning Automatic scheduling for manufacturing. DARPAs DART system used in Desert Storm and Desert Shield operations to plan logistics of people and supplies. American Airlines rerouting contingency planner. European space agency planning and scheduling of spacecraft assembly, integration and verification.Some AI "Grand Challenge" Problems Translating telephone Accident-avoiding car Aids for the disabled Smart clothes Intelligent agents that monitor and manage information by filtering, digesting, abstracting Tutors Self-organizing systems, e.g., that learn to assemble something by observing a human do it.A Framework for Building AI Systems PerceptionIntelligent biological systems are physically embodied in the world and experience the world throughtheir sensors (senses). For an autonomous vehicle, input might be images from a camera and rangeinformation from a rangefinder. For a medical diagnosis system, perception is the set of symptomsand test results that have been obtained and input to the system manually. Includes areas of vision,speech processing, natural language processing, and signal processing (e.g., market data and acousticdata). ReasoningInference, decision-making, classification from what is sensed and what the internal "model" is of theworld. Might be a neural network, logical deduction system, Hidden Markov Model induction,heuristic searching a problem space, Bayes Network inference, genetic algorithms, etc. Includes areasof knowledge representation, problem solving, decision theory, planning, game theory, machinelearning, uncertainty reasoning, etc.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 3
  •  ActionBiological systems interact within their environment by actuation, speech, etc. All behavior iscentered around actions in the world. Examples include controlling the steering of a Mars rover orautonomous vehicle, or suggesting tests and making diagnoses for a medical diagnosis system.Includes areas of robot actuation, natural language generation, and speech synthesis.Some Fundamental Issues for Most AI Problems Representation Facts about the world have to be represented in some way, e.g., mathematical logic is one language that is used in AI. Deals with the questions of what to represent and how to represent it. How to structure knowledge? What is explicit, and what must be inferred? How to encode "rules" for inferencing so as to find information that is only implicitly known? How to deal with incomplete, inconsistent, and probabilistic knowledge? Epistemology issues (what kinds of knowledge are required to solve problems). Example: "The fly buzzed irritatingly on the window pane. Jill picked up the newspaper." Inference: Jill has malicious intent; she is not intending to read the newspaper, or use it to start a fire, or ... Example: Given 17 sticks in 3 x 2 grid, remove 5 sticks to leave exactly 3 squares. Search Many tasks can be viewed as searching a very large problem space for a solution. For example, Checkers has about 1040 states, and Chess has about 10120 states in a typical games. Use of heuristics (meaning "serving to aid discovery") and constraints. Inference From some facts others can be inferred. Related to search. For example, knowing "All elephants have trunks" and "Clyde is an elephant," can we answer the question "Does Clyde hae a trunk?" What about "Peanuts has a trunk, is it an elephant?" Or "Peanuts lives in a tree and has a trunk, is it an elephant?" Deduction, abduction, non-monotonic reasoning, reasoning under uncertainty. Learning Inductive inference, neural networks, genetic algorithms, artificial life, evolutionary approaches. Planning Starting with general facts about the world, facts about the effects of basic actions, facts about a particular situation, and a statement of a goal, generate a strategy for achieving that goals in terms of a sequence of primitive steps or actions. The State of the Art Computer beats human in a chess game. Computer-human conversation using speech recognition. Computer program can chat with human Expert system controls a spacecraft. Robot can walk on stairs and hold a cup of water. Language translation for webpages. Home appliances use fuzzy logic.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 4
  • Agent and Environment An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. A human agent has eyes, ears, and other organs for sensors and hands, legs, mouth, and other body parts for actuators. A robotic agent might have cameras and infrared range finders for sensors and various motors for actuators. A software agent receives keystrokes, file contents, and network packets as sensory inputs and acts on the environment by displaying on the screen, writing files, and sending network packets. We will make the general assumption that every agent can perceive its own actions (but not always the effects). We use the term percept to refer to the agents perceptual inputs at any given instant. An agents percept sequence is the complete history of everything the agent has ever perceived. In general, an agents choice of action at any given instant can depend on the entire percept sequence observed to date If we can specify the agents choice of action for every possible percept sequence, then we have said more or less everything there is to say about the agent. Mathematically speaking, we say that an agents behavior is described by the agent function that maps any given percept sequence to an action. f : P * A The agent program runs on the physical architecture to produce fPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 5
  • Fig. Agents interact with environments through sensors and actuators Fig. Vacuum cleaner world Percepts: location and contents, e.g., [A, Dirty] Actions: Left, Right, Suck, NoOp For Vacuum Cleaner Agent: Percept sequence Action [A, Clean] Right [A, Dirty] Suck [B, Clean] Left [B, Dirty] Suck [A, Clean], [A, Clean] Right [A, Clean], [A, Dirty] Suck … function Reflex-Vacuum-Agent( [location,status]) returns an action if status = Dirty then return Suck else if location =A then return Right else if location = B then return LeftRationalityDefinition of Rational Agent:For each possible percept sequence, a rational agent should select an action that is expected to maximize itsperformance measure, given the evidence provided by the percept sequence and whatever built-in knowledgethe agent has.Rational ≠ omniscient (percepts may not supply all relevant information)Rational ≠ clairvoyant (action outcomes may not be as expected)Hence, rational ≠ successfulPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 6
  • Rational exploration, learning, autonomyPEAS (Performance measure, Environment, Actuators, Sensors)To design a rational agent, we must specify the task environments. Task environments areessentially the "problems" to which rational agents are the "solutions." Those taskenvironments come in a variety of flavors and the flavor of the task environment directlyaffects the appropriate design for the agent program.Consider, e.g., the task of designing an automated taxi:Agent Type Performance Environment Actuators Sensors MeasureTaxi driver Safe, fast, legal, Roads, other traffic, Steering, accelerator, Cameras, sonar, comfortable trip, pedestrians, customers brake, signal, horn, speedometer, GPS, maximize profits display odometer, accelerometer, engine sensors, keyboard Figure PEAS description of the task environment for an automated taxi.Agent Type Performance Environment Actuators Sensors MeasureMedical Healthy patient, Patient, hospital, staff Display Keyboard entry ofdiagnosis system minimize costs, questions, tests, symptoms, findings, lawsuits diagnoses, treatments, patients answers referralsInternet Shopping Price, quality, www sites, vendors, Display to user, follow HTML pages (text,Agent appropriateness, shippers URL, fill in form graphics, scripts) efficiencyThe range of task environments that might arise in AI is obviously vast. We can, however,identify a fairly small number of dimensions along which task environments can be catego-rized.Fully observable vs. partially observable:If an agents sensors give it access to the complete state of the environment at each point intime, then we say that the task environment is fully observable. An environment might bepartially observable because of noisy and inaccurate sensors or because parts of the state aresimply missing from the sensor data For example, a vacuum agent with only a local dirtsensor cannot tell whether there is dirt in other squares, and an automated taxi cannot seewhat other drivers are thinking.Deterministic vs. stochastic.If the next state of the environment is completely determined by the current state and theaction executed by the agent, then we say the environment is deterministic; otherwise, it isstochastic.Episodic vs. sequential.In an episodic task environment, the agents experience is divided into atomic episodes.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 7
  • Each episode consists of the agent perceiving and then performing a single action. Crucially,the next episode does not depend on the actions taken in previous episodes. In episodicenvironments, the choice of action in each episode depends only on the episode itself. Insequential environments, on the other hand, the current decision could affect all futuredecisions. Chess and taxi driving are sequential: in both cases, short-term actions can havelong-term consequences. Episodic environments are much simpler than sequentialenvironments because the agent does not need to think ahead.Static vs. dynamic.If the environment can change while an agent is deliberating, then we say the environ-ment is dynamic for that agent; otherwise, it is static. If the environment itself does notchange with the passage of time but the agents performance score does, then we say theenvironment is semidynamic. Taxi driving is clearly dynamic: the other cars and the taxiitself keep moving while the driving algorithm dithers about what to do next. Chess, whenplayed with a clock, is semidynamic. Crossword puzzles are static.Discrete vs. continuous.The discrete/continuous distinction can be applied to the state of the environment, to the waytime is handled, and to the percepts and actions of the agent. For example, a discrete-stateenvironment such as a chess game has a finite number of distinct states. Chess also has adiscrete set of percepts and actions. Taxi driving is a continuous-stateSingle agent vs. multiagent.Single agent and multiagent environment is differentiated by observing no. of agents in theenvironment. For example, an agent solving a crossword puzzle by itself is clearly in asingle-agent environment, whereas an agent playing chess is in a two-agent environment.As one might expect, the hardest case is partially observable, stochastic, sequential, dynamic,continuous, and multiagent. The real world is partially observable, stochastic, sequential,dynamic, continuous, multi-agent.There are four basic kinds of agent program that embody the principles underlying almost allintelligent systems. All these can be turned into learning agents • Simple reflex agents; • Model-based reflex agents; • Goal-based agents; and• Utility-based agents.All these can be turned into learning agents.Agent types; simple reflex Select action on the basis of only the current percept.E.g. the vacuum-agent Large reduction in possible percept/action situations. Implemented through condition-action rules If dirty then suckPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 8
  • function REFLEX-VACUUM-AGENT ([location, status]) return an actionif status == Dirty then return Suckelse if location == A then return Rightelse if location == B then return LeftReduction from 4T to 4 entriesAgent types; reflex and state To tackle partially observable environments. Maintain internal state Over time update state using world knowledge How does the world change. How do actions affect world. ⇒Model of WorldAgent types; goal-based The agent needs a goal to know which situations are desirable. o Things become difficult when long sequences of actions are required to find the goal. Typically investigated in search and planning research. Major difference: future is taken into account Is more flexible since knowledge is represented explicitly and can be manipulated.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 9
  • Agent types; utility-based Certain goals can be reached in different ways. o Some are better, have a higher utility. Utility function maps a (sequence of) state(s) onto a real number. Improves on goals: o Selecting between conflicting goals o Select appropriately between several goals based on likelihood of success.Agent types; learning All previous agent-programs describe methods for selecting actions. o Yet it does not explain the origin of these programs. o Learning mechanisms can be used to perform this task. o Teach them instead of instructing them. o Advantage is the robustness of the program toward initially unknown environments. Learning element: introduce improvements in performance element. Critic provides feedback on agents performance based on fixed performance standard. Performance element: selecting actions based on percepts. Corresponds to the previous agent programsPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 10
  • Problem generator: suggests actions that will lead to new and informative experiences. Exploration vs. exploitation KNOWLEDGE • Data = collection of facts, measurements, statistics • Information = organized data • Knowledge = contextual, relevant, actionable information – Strong experiential and reflective elements – Good leverage and increasing returns – Dynamic – Branches and fragments with growth – Difficult to estimate impact of investment – Uncertain value in sharing – Evolves over time with experience • Explicit knowledge – Objective, rational, technical – Policies, goals, strategies, papers, reports – Codified – Leaky knowledge • Tacit knowledge – Subjective, cognitive, experiential learning – Highly personalized – Difficult to formalize – Sticky knowledgePrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 11
  • Chapter 2 :Problem SolvingProblem-solving agentFour general steps in problem solving: Goal formulation o What are the successful world states Problem formulation o What actions and states to consider to give the goal Search o Determine the possible sequence of actions that lead to the states of known values and then choosing the best sequence. Execute o Give the solution perform the actions.function SIMPLE-PROBLEM-SOLVING-AGENT(percept) return an actionstatic: seq, an action sequencestate, some description of the current world stategoal, a goalproblem, a problem formulationstate UPDATE-STATE(state, percept)if seq is empty thengoal FORMULATE-GOAL(state)problem FORMULATE-PROBLEM(state,goal)seq SEARCH(problem)action FIRST(seq)seq REST(seq)return actionEXAMPLE: On holiday in Romania; currently in Arad o Flight leaves tomorrow from Bucharest Formulate goal o Be in Bucharest Formulate problem o States: various cities o Actions: drive between cities Find solution o Sequence of cities; e.g. Arad, Sibiu, Fagaras, Bucharest, …Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 12
  • Selecting a state space Real world is absurdly complex. State space must be abstracted for problem solving. (Abstract) state = set of real states. (Abstract) action = complex combination of real actions. o e.g. Arad ®Zerind represents a complex set of possible routes, detours, rest stops, etc. o The abstraction is valid if the path between two states is reflected in the real world. (Abstract) solution = set of real paths that are solutions in the real world. _ Each abstract action should be ―easier‖ than the real problem.Formulating Problem as a GraphIn the graph each node represents a possible state; a node is designated as the initial state; one or more nodes represent goal states, states in which the agent‘s goal is considered accomplished. each edge represents a state transition caused by specific agent action; associated to each edge is the cost of performing that transition.State space graph of vacuum worldExample: vacuum world States?? two locations with or without dirt: 2 x 22=8 states. Initial state?? Any state can be initial Actions?? {Left, Right, Suck} Goal test?? Check whether squares are clean. o Path cost?? Number of actions to reach goal.Example: 8-puzzle States?? Integer location of each tile Initial state?? Any state can be initial Actions?? {Left, Right, Up, Down} Goal test?? Check whether goal configuration is reached o Path cost?? Number of actions to reach goalPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 13
  • Problem Solving as SearchSearch space: set of states reachable from an initial state S0 via a (possibly empty/finite/infinite)sequence of state transitions.To achieve the problem‘s goal search the space for a (possibly optimal) sequence of transitions starting from S0 and leading to a goal state; execute (in order) the actions associated to each transition in the identified sequence.Depending on the features of the agent‘s world the two steps above can be interleaved.How do we reach a goal state?There may be several possible ways. Or none!Factors to consider: cost of finding a path; cost of traversing a path.Problem Solving as Search Reduce the original problem to a search problem. A solution for the search problem is a path initial state–goal state. The solution for the original problem is either o the sequence of actions associated with the path o Or the description of the goal state.Example: The 8-puzzleIt can be generalized to 15-puzzle, 24-puzzle, or (n2 − 1)-puzzle for n ≥ 6. States: configurations of tilesPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 14
  • Operators: move one tile Up/Down/Left/Right There are 9! = 362, 880 possible states (all permutations of {⊓⊔, 1, 2, 3, 4, 5, 6, 7, 8}). There are 16! possible states for 15-puzzle. Not all states are directly reachable from a given state. (In fact, exactly half of them are reachable from a given state.)How can an artificial agent represent the states and the statespace for this problem?Go from state S to state G.Problem formulation A problem is defined by: o An initial state, e.g. Arad o Successor function S(X)= set of action-state pairs  e.g. S(Arad)={<Arad ® Zerind, Zerind>,…} intial state + successor function = state space o Goal test, can be  Explicit, e.g. x=‗at bucharest‘  Implicit, e.g. checkmate(x) o Path cost (additive)  e.g. sum of distances, number of actions executed, …  c(x,a,y) is the step cost, assumed to be >= 0A solution is a sequence of actions from initial to goal state.Optimal solution has the lowest path cost.Problem formulation1. Choose an appropriate data structure to represent the world states.2. Define each operator as a precondition/effects pair where the precondition holds exactly in the states the operator applies to, effects describe how a state changes into a successor state by the application of the operator.3. Specify an initial state.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 15
  • 4. Provide a description of the goal (used to check if a reached state is a goal state).Formulating the 8-puzzle ProblemStates: each represented by a 3 × 3 array of numbers in [0 . . . 8], where value 0 is for the empty cell. Operators: 24 operators of the form Op(r,c,d) where r, c ∈ {1, 2, 3}, d ∈ {L,R,U,D}. Op(r,c,d) moves the empty space at position (r, c) in the direction d.Example: Op(3,2,R)We have 24 operators in this problem formulation . . .20 too many!Problem types Deterministic, fully observable ⇒single state problem o Agent knows exactly which state it will be in; solution is a sequence. Partial knowledge of states and actions: o Non-observable ⇒sensorless or conformant problem  Agent may have no idea where it is; solution (if any) is a sequence. o Nondeterministic and/or partially observable ⇒contingency problem  Percepts provide new information about current state; solution is a tree or policy; often interleave search and execution.  If uncertainty is caused by actions of another agent: adversarial problem o Unknown state space ⇒exploration problem (―online‖)  When states and actions of the environment are unknown.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 16
  • Problem Solutions need Well-Defined Problems, and Well Defined Problems need to embody explicit solutions on possible solutions: well defined problems must define the space of possible solutions.We use searching to solve well defined problems.Constraint satisfaction problemsWhat is a CSP? Finite set of variables V1, V2, …, Vn Finite set of constraints C1, C2, …, Cm Nonemtpy domain of possible values for each variables DV1, DV2, … DVn Each constraint Ci limits the values that variables can take,  e.g., V1 ≠ V2 A state is defined as an assignment of values to some or all variables. Consistent assignment: assignment does not not violate the constraints. An assignment is complete when every value is mentioned. A solution to a CSP is a complete assignment that satisfies all constraints. Some CSPs require a solution that maximizes an objective function. Applications: Scheduling the time of observations on the Hubble Space Telescope, Floor planning, Map coloring, Cryptography CSPs are a special kind of problem: states defined by values of a fixed set of variables, goal test defined by constraints on variable valuesVarieties of Constraints Unary constraints involve a single variable. e.g. SA ¹ green Binary constraints involve pairs of variables. e.g. SA ¹ WA Higher-order constraints involve 3 or more variables. e.g. cryptharithmetic column constraints. Preference (soft constraints) e.g. red is better than greenoften representable by a cost for each variable assignment constrained optimization problems.CSP example: map coloring Variables: WA, NT, Q, NSW, V, SA, T Domains: Di={red,green,blue} Constraints: adjacent regions must have different colors. o E.g. WA ¹ NT (if the language allows this) o E.g. (WA,NT) ¹ {(red,green),(red,blue),(green,red),…}Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 17
  • Solutions are assignments satisfying all constraints, e.g. {WA=red,NT=green,Q=red,NSW=green,V=red,SA=blue,T=green}Constraint graphCSP benefits Standard representation pattern Generic goal and successor functions Generic heuristics (no domain specific expertise).Constraint graph = nodes are variables, edges show constraints. Graph can be used to simplify search. o e.g. Tasmania is an independent subproblem.Cryptarithmetic conventions Each letter or symbol represents only one digit throughout the problem; When letters are replaced by their digits, the resultant arithmetical operation must be correct; The numerical base, unless specifically stated, is 10; Numbers must not begin with a zero; There must be only one solution to the problem. 1. S E N D + M O R E ------------ M O N E YWe see at once that M in the total must be 1, since the total of the column SM cannot reach ashigh as 20. Now if M in this column is replaced by 1, how can we make this column total asmuch as 10 to provide the 1 carried over to the left below? Only by making S very large: 9 or8. In either case the letter O must stand for zero: the summation of SM could produce only 10or 11, but we cannot use 1 for letter O as we have already used it for M.If letter O is zero, then in column EO we cannot reach a total as high as 10, so that there willbe no 1 to carry over from this column to SM. Hence S must positively be 9.Since the summation EO gives N, and letter O is zero, N must be 1 greater than E and thecolumn NR must total over 10. To put it into an equation: E + 1 = NFrom the NR column we can derive the equation: N + R + (+ 1) = E + 10We have to insert the expression (+ 1) because we don‘t know yet whether 1 is carried overfrom column DE. But we do know that 1 has to be carried over from column NR to EO.Subtract the first equation from the second: R + (+1) = 9Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 18
  • We cannot let R equal 9, since we already have S equal to 9. Therefore we will have to makeR equal to 8; hence we know that 1 has to be carried over from column DE.Column DE must total at least 12, since Y cannot be 1 or zero. What values can we give Dand E to reach this total? We have already used 9 and 8 elsewhere. The only digits left thatare high enough are 7, 6 and 7, 5. But remember that one of these has to be E, and N is 1greater than E. Hence E must be 5, N must be 6, while D is 7. Then Y turns out to be 2, andthe puzzle is completely solved. S E N D 9 5 6 7 + M O R E 1 0 8 5 --------- M O N E Y 1 0 6 5 22. T W O+ T W O _____ F O U R Since, Lets first check with F as 0.Now imagine O with highest possible value 9.Now R must be 8and T should be 4. Now among the remaining numbers if we check then we get U as 3.Thus W mustbe 6, T W O 4 6 9+ T W O 4 6 9 _____ F O U R 0 9 3 8Game PlayingSummary Games are fun (and dangerous) They illustrate several important points about AI Perfection is unattainable -> approximationPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 19
  • Good idea what to think about Uncertainty constrains the assignment of values to states Games are to AI as grand prix racing is to automobile design. Games are a form of multi-agent environment o What do other agents do and how do they affect our success? o Cooperative vs. competitive multi-agent environments. o Competitive multi-agent environments give rise to adversarialproblems a.k.a. games Why study games? o Fun; historically entertaining o Interesting subject of study because they are hard  Chess game:  average branch factor: 35, each player: 50 moves-> Search tree: 35100 nodes Relation of Search and Games Search – no adversary Solution is (heuristic) method for finding goal Heuristics and CSP techniques can find optimal solution Evaluation function: estimate of cost from start to goal through given node Examples: path planning, scheduling activities Games – adversary Solution is strategy (strategy specifies move for every possible opponent reply). Time limits force an approximate solution Evaluation function: evaluate ―goodness‖ of game position Examples: chess, checkers, Othello, backgammonTypes Of Games Multiplayer Games allow more than one playerGame setup Two players: MAX and MIN MAX moves first and they take turns until the game is over. Winner gets award, looser gets penalty. Games as search: o Initial state: e.g. board configuration of chess o Successor function: list of (move,state) pairs specifying legal moves. o Terminal test: Is the game finished? o Utility function: Gives numerical value of terminal states.  E.g. win (+1), loose (-1) and draw (0) in tic-tac-toe (next) MAX uses search tree to determine next move.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 20
  • Partial Game Tree for Tic Tac ToeOptimal strategies Find the contingent strategy for MAX assuming an infallible MIN opponent. Assumption: Both players play optimally !! Given a game tree, the optimal strategy can be determined by using the minimax value of each node:MINIMAX-VALUE(n)= Two-Ply Game Tree Minimax maximizes the worst-case outcome for max.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 21
  • Production SystemProduction systems are applied to problem solving programs that must perform a wide-range ofseaches. Production ssytems are symbolic AI systems. The difference between these two terms isonly one of semantics. A symbolic AI system may not be restricted to the very definition of productionsystems, but they cant be much different either.Production systems are composed of three parts, a global database, production rules and a controlstructure.A production system (or production rule system) is a computer program typically used to providesome form of artificial intelligence, which consists primarily of a set of rules about behavior. Theserules, termed productions, are a basic representation found useful in automated planning, expertsystems and action selection. A production system provides the mechanism necessary to executeproductions in order to achieve some goal for the system.Productions consist of two parts: a sensory precondition (or "IF" statement) and an action (or"THEN"). If a productions precondition matches the current state of the world, then the production issaid to betriggered. If a productions action is executed, it is said to have fired.The first production systems were done by Newell and Simon in the 1950s, and the idea was writtenup in their (1972)."Production" in the title of these notes (or "production rule") is a synonym for "rule", i.e. for acondition-action rule (see below). The term seems to have originated with the term used forrewriting rules in the Chomsky hierarchy of grammar types, where for example context-freegrammar rules are sometimes referred to as context-free productions.RulesThese are also called condition-action rules.These components of a rule-based system have the form:if <condition> then <conclusion>orif <condition> then <action>Example:if patient has high levels of the enzyme ferritin in their blood and patient has the Cys282→Tyr mutation in HFE genethen conclude patient has haemochromatosis** medical validity of this rule is not asserted hereRules can be evaluated by: backward chaining forward chainingPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 22
  • Backward Chaining To determine if a decision should be made, work backwards looking for justifications for the decision. Eventually, a decision must be justified by facts.Forward Chaining Given some facts, work forward through inference net. Discovers what conclusions can be derived from data.Forward Chaining 2Until a problem is solved or no rules if part is satisfied by the current situation: 1. Collect rules whose if parts are satisfied. 2. If more than one rules if part is satisfied, use a conflict resolution strategy to eliminate all but one. 3. Do what the rules then part says to do.Production RulesA production rule system consists ofPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 23
  • a set of rules working memory that stores temporary data a forward chaining inference engineMatch-Resolve-Act CycleThe match-resolve-act cycle is what the inference engine does.loopmatch conditions of rules with contents of working memoryif no rule matches then stopresolve conflictsact (i.e. perform conclusion part of rule)end loopChapter-33.1. Uninformed Search3.1.1 Breadth-first search (BFS) Description  A simple strategy in which the root is expanded first then all the root successors are expanded next, then their successors.  We visit the search tree level by level that all nodes are expanded at a given depth before any nodes at the next level are expanded.  Order in which nodes are expanded. Performance Measure:  Completeness:  it is easy to see that breadth-first search is complete that it visit all levels given that d factor is finite, so in some d it will find a solution.  Optimality:  breadth-first search is not optimal until all actions have the same cost.  Space complexity and Time complexity:  Consider a state space where each node as a branching factor b, the root of the tree generates b 2 3 nodes, each of which generates b nodes yielding b each of these generates b and so on.  In the worst case, suppose that our solution is at depth d, and we expand all nodes but the last node 2 3 4 d+1 d+1 at level d, then the total number of generated nodes is: b + b + b + b + b – b = O(b ), which is the time complexity of BFS.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 24
  •  As all the nodes must retain in memory while we expand our search, then the space complexity is like d+1 the time complexity plus the root node = O(b ). Conclusion:  We see that space complexity is the biggest problem for BFS than its exponential execution time.  Time complexity is still a major problem, to convince your-self look at the table below.3.1.2. Depth-first search (DFS) Description:  DFS progresses by expanding the first child node of the search tree that appears and thus going deeper and deeper until a goal node is found, or until it hits a node that has no children. Then the search backtracks, returning to the most recent node it hasn’t finished exploring.  Order in which nodes are expanded Performance Measure:  Completeness:  DFS is not complete, to convince yourself consider that our search start expanding the left sub tree of the root for so long path (may be infinite) when different choice near the root could lead to a solution, now suppose that the left sub tree of the root has no solution, and it is unbounded, then the search will continue going deep infinitely, in this case we say that DFS is not complete.  Optimality:  Consider the scenario that there is more than one goal node, and our search decided to first expand the left sub tree of the root where there is a solution at a very deep level of this left sub tree, in the same time the right sub tree of the root has a solution near the root, here comes the non-optimality of DFS that it is not guaranteed that the first goal to find is the optimal one, so we conclude that DFS is not optimal.  Time Complexity:  Consider a state space that is identical to that of BFS, with branching factor b, and we start the search from the root.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 25
  •  In the worst case that goal will be in the shallowest level in the search tree resulting in generating all m tree nodes which are O(b ).  Space Complexity:  Unlike BFS, our DFS has a very modest memory requirements, it needs to story only the path from the root to the leaf node, beside the siblings of each node on the path, remember that BFS needs to store all the explored nodes in memory.  DFS removes a node from memory once all of its descendants have been expanded.  With branching factor b and maximum depth m, DFS requires storage of only bm + 1 nodes which d+1 areO(bm) compared to the O(b ) of the BFS. Conclusion:  DFS may suffer from non-termination when the length of a path in the search tree is infinite, so we perform DFS to a limited depth which is called Depth-limited Search.3.1.3 Depth Limited Search• Breadth first has computational, especially, space problems. Depth first can run off down a very long (or infinite) path..• Idea: introduce a depth limit on branches to be expanded.• Don‘t expand a branch below this depth.• Most useful if you know the maximum depth of the solution.  Perform depth first search but only to a pre-specified depth limit L.  No node on a path that is more than L steps from the initial state is placed on the Frontier.  We ―truncate‖ the search by looking only at paths of length L or less. Description:  The unbounded tree problem appeared in DFS can be fixed by imposing a limit on the depth that DFS can reach, this limit we will call depth limit l, this solves the infinite path problem. Performance Measure:  Completeness:  The limited path introduces another problem which is the case when we choose l < d, in which is our DLS will never reach a goal, in this case we can say that DLS is not complete.  Optimality:  One can view DFS as a special case of the depth DLS, that DFS is DLS with l = infinity.  DLS is not optimal even if l > d. l  Time Complexity: O(b )  Space Complexity: O(bl) Conclusion:  DLS can be used when the there is a prior knowledge to the problem, which is always not the case, Typically, we will not know the depth of the shallowest goal of a problem unless we solved this problem before.It is Depth First -search with depth limit l.  i.e. nodes at depth l have no successors.  Problem knowledge can be used Solves the infinite-path problem. If l < d then incompleteness results. If l > d then not optimal. Time complexity: O(bl ) Space complexity: O(bl )Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 26
  • Advantages Will always terminate Will find solution if there is one in the depth boundDisadvantages • Too small a depth bound misses solutions • Too large a depth bound may find poor solutions when there are better ones3.1.4. Search Strategies’ Comparison:Here is a table that compares the performance measures of each search strategy.3.2. Informed Search- more powerful than uninformed- Informed = use problem-specific knowledge3.2.1. Hill Climbing  Here feedback from the test procedure is used to help the generator decide which direction to move in search space.  The test function is augmented with a heuristic function that provides an estimate of how close a given state is to the goal state.  Computation of heuristic function can be done with negligible amount of computation.  Greedy local searchHill climbing is often used when a good heuristic function is available for evaluating states but whenno other useful knowledge is availablePrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 27
  •  Loop that continuously moves in the direction of increasing value  Terminates when it reaches a ―Peak‖  Problem: depending on initial state, can get stuck in local maxima This simple policy has three well-known drawbacks:1. Local Maxima: a local maximum as opposed to global maximum.2. Plateaus: An area of the search space where evaluation function is flat, thus requiring random walk.3. Ridge: Where there are steep slopes and the search direction is not towards the top but towards the side.Variations of Hill Climbing Stochastic hill-climbing o Random selection among the uphill moves. o The selection probability can vary with the steepness of the uphill move. First-choice hill-climbing o cfr. stochastic hill climbing by generating successors randomly until a better one is found. Random-restart hill-climbing o Tries to avoid getting stuck in local maxima.3.2.2. Best First Search General approach of informed search: o Best-first search: node is selected for expansion based on an evaluation function f(n) Idea: evaluation function measures distance to the goal. o Choose node which appears best Implementation: o fringe is queue sorted in decreasing order of desirability. o Special cases: greedy search, A* search Best First Search is a general search strategy Uses an evaluation function f(n) in deciding which node (in queue) to expand next Note: ―best‖ could be misleading (it is relative, not absolute) Greedy search is one type of Best First Search3.2.2.1.Greedy Search Use a heuristic h() (cost estimate to goal) as the evaluation functionPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 28
  • Example: straight-line distance in finding a path from one city to another Evaluation function f(n) = h(n) (heuristic)= (estimate of cost from n to goal) e.g., hSLD(n) = straight-line distance from n to Bucharest Greedy best-first search expands the node that appears to be closest to goal Complete? No – can get stuck in loops, e.g., Iasi  Neamt  Iasi  Neamt  Time? O(bm), but a good heuristic can give dramatic improvement Space? O(bm) -- keeps all nodes in memory Optimal? No But can be acceptable in practice3.2.2. A* Search Best-known form of best-first search. Idea: avoid expanding paths that are already expensive. Evaluation function f(n)=g(n) + h(n) o g(n) the cost (so far) to reach the node. o h(n) estimated cost to get from the node to the goal. o f(n) estimated total cost of path through n to goal. A* search uses an admissible heuristic o A heuristic is admissible if it never overestimates the cost to reach the goal o Are optimistic Formally: 1. h(n) <= h*(n) where h*(n) is the true cost from n 2. h(n) >= 0 so h(G)=0 for any goal G. e.g. hSLD(n) never overestimates the actual road distanceexample: Find Bucharest starting at Aradf(Arad) = c(??,Arad)+h(Arad)=0+366=366Initial State:Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 29
  • Expand Arrad and determine f(n) for each node f(Sibiu)=c(Arad,Sibiu)+h(Sibiu)=140+253=393 f(Timisoara)=c(Arad,Timisoara)+h(Timisoara)=118+329=447 f(Zerind)=c(Arad,Zerind)+h(Zerind)=75+374=449 Best choice is SibiuAnd so on…Admissible Heuristic A heuristic h(n) is admissible if for every node n, h(n) ≤ h*(n), where h*(n) is the true cost to reach the goal state from n. An admissible heuristic never overestimates the cost to reach the goal, i.e., it is optimistic Example: hSLD(n) (never overestimates the actual road distance) Theorem: If h(n) is admissible, A* using TREE-SEARCH is optimalA* Search Evaluation Completeness: YES Time complexity: (exponential with path length) Space complexity:(all nodes are stored) Optimality: YES  Cannot expand fi+1 until fi is finished.  A* expands all nodes with f(n)< C*  A* expands some nodes with f(n)=C*Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 30
  •  A* expands no nodes with f(n)>C* Also optimally efficient (not including ties)3.2.3. Adversarial SearchMINMAX procedure  Perfect play for deterministic games  Idea: choose move to position with highest minimax value = best achievable payoff against best play  E.g., 2-ply game: MINMAX Algorithm minimax(player,board) if(game over in current board position) return winner children = all legal moves for player from this board if(maxs turn) return maximal score of calling minimax on all the children else (mins turn) return minimal score of calling minimax on all the children Complete? Yes (if tree is finite) Optimal? Yes (against an optimal opponent) Time complexity? O(bm) Space complexity? O(bm) (depth-first exploration) For chess, b ≈ 35, m ≈100 for "reasonable" games  exact solution completely infeasibleAlpha Beta Pruning ALPHA-BETA pruning is a method that reduces the number of nodes explored in Minimax strategy. It reduces the time required for the search and it must be restricted so that no time is to be wasted searching moves that are obviously bad for the current player. The exact implementation of alpha-beta keeps track of the best move for each side as it moves throughout the tree.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 31
  • Properties of α-β Pruning does not affect final result Good move ordering improves effectiveness of pruningPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 32
  • With "perfect ordering," time complexity = O(bm/2)  doubles depth of search Why it is called alpha-beta?  A simple example of the value of reasoning about which computations are relevant  α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max  If v is worse than α, max will avoid it  prune that branch  Define β similarly for minChapter 44.1.1 Logics are formal languages for formalizing reasoning, in particular for representinginformation such that conclusions can be drawnLogic involves: – A language with a syntax for specifying what is a legal expression in the language; syntax defines well formed sentences in the language – Semantics for associating elements of the language with elements of some subject matter. Semantics defines the "meaning" of sentences (link to the world); i.e., semantics defines the truth of a sentence with respect to each possible world – Inference rules for manipulating sentences in the language4.1.2. Syntax (grammar, internal structure of the language) – Vocabulary: grammatical categories – Identifying Well-Formed Formulae (―WFFs‖)4.1.3 Semantics (pertaining to meaning and truth value) – Translation – Truth functions – Truth tables for the connectives4.1.4. Connectives (“Sentence-Forming Operators”)~ negation ―not,‖ ―it is not the case that‖⋅ conjunction ―and‖∨ disjunction ―or‖ (inclusive)⊃ conditional ―if – then,‖ ―implies‖≣ biconditional ―if and only if,‖ ―iff‖ • Connect to sentences to make new sentences • Negation attaches to one sentence – It is not raining ∼ RPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 33
  • • Conjunction, disjunction, conditional and biconditional attach two sentences together – It is raining and it is cold R ∙ C – If it rains then it pours R⊃P4.1.5. Well-Formed FormulaeRules for WFF 1. A sentence letter by itself is a WFF A B Z 2. The result of putting immediately in front of a WFF is a WFF A B B (A B) ( C D) 3. The result of putting , , , or between two WFFs and surrounding the whole thing with parentheses is a WFF (A B) ( C D) (( C D) (E (F G))) 4. Outside parentheses may be dropped A B C D ( C D) (E (F G))A sentence that can be constructed by applying the rules for constructing WFFs one at a time is aWFFA sentence which cant be so constructed is not a WFF. – Atomic sentences are wffs: Propositional symbol (atom) Examples: P, Q, R, BlockIsRed, SeasonIsWinter – Complex or compound wffs. Given w1 and w2 wffs: w1 (negation) (w1 w2) (conjunction) (w1 w2) (disjunction) (w1 w2) (implication; w1 is the antecedent; w2 is the consequent) (w1 w2) (biconditional)4.1.6. TautologyIf a wff is True under all the interpretations of its constituents atoms, we say thatthe wff is valid or it is a tautology.Examples: 1 P P 2 (P P) 3 [P (Q P)] 4 [(P Q) P) P]An inconsistent sentence or contradiction is a sentence that is False under all interpretations. Theworld is never like what it describes, as in ―It‘s raining and it‘s not raining.‖4.1.7.ValidityAn argument is valid whenever the truth of all its premises implies the truth of its conclusion.An argument is a sequence of propositions. The final proposition is called the conclusion of the argument while the otherproposition are called the premises or hypotheses of the argument.one can use the rules of inference to show the validity of an argument.Note that p1, p2, … q are generally compound propositions or wffs.4.2.Intelligent agents should have capacity for:Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 34
  •  Perceiving: acquiring information from environment,  Knowledge Representation: representing its understanding of the world,  Reasoning: inferring the implications of what it knows and of the choices it has, and  Acting: choosing what it want to do and carry it out.4.2.1.Knowledge Base  Representation of knowledge and the reasoning processes that brings knowledge to life – center to entire field of AI  Knowledge and reasoning also play a crucial role in dealing partially observable environments  Central component of Knowledge-based agent is its knowledge base.  Knowledge base = set of sentences in a formal language  Declarative approach to building an agent (or other system):  TELL it what it needs to know  Then it can Ask itself what to do - answers should follow from the KB4.2.2.Entailment Entailment means that one thing follows from another: KB ╞ α Knowledge base KB entails sentence α if and only if α is true in all worlds where KB is true o e.g., the KB containing ―the Giants won‖ and ―the Reds won‖ entails ―Either the Giants won or the Reds won‖ o E.g., x+y = 4 entails 4 = x+y o Entailment is a relationship between sentences (i.e., syntax) that is based on semanticsInference Notation :KB ├i α = sentence α can be derived from KB by procedure i Soundness: i is sound if whenever KB ├i α, it is also true that KB╞ α Completeness: i is complete if whenever KB╞ α, it is also true that KB ├i αSound Rules of InferenceHere are some examples of sound rules of inference  A rule is sound if its conclusion is true whenever the premise is trueEach can be shown to be sound using a truth tableRULE PREMISE CONCLUSIONModus Ponens A, A B BAnd Introduction A, B A BAnd Elimination A B ADouble Negation A AUnit Resolution A B, B AResolution A B, B C A CPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 35
  • Soundness of Modus PonensA B A→B OK?True True TrueTrue False FalseFalse True TrueFalse False TrueHorn ClauseA Horn sentence or Horn clause has the form:P1 P2 P3 ... Pn Qor alternatively P1 P2 P3 ... Pn Qwhere Ps and Q are non-negated atoms • To get a proof for Horn sentences, apply Modus Ponens repeatedly until nothing can be done • We will use the Horn clause form later4.2.3.Propositional LogicPropositional Logic Syntax  Propositional logic is the simplest logic – illustrates basic ideas  All objects described are fixed or unique  E.g. "John is a student" student(john) ; Here John refers to one unique person.  In propositional logic (PL) an user defines a set of propositional symbols, like P and Q. User defines the semantics of each of these symbols. For example,  P means "It is hot"  Q means "It is humid―  R means "It is raining"  The proposition symbols: S, S1, S2 etc are sentences _ If S is a sentence, ØS is a sentence (negation ) _ If S1 and S2 are sentences, S1 Ù S2 is a sentence (conjunction ) _ If S1 and S2 are sentences, S1 Ú S2 is a sentence (disjunction ) _ If S1 and S2 are sentences, S1 => S2 is a sentence (implication ) _ If S1 and S2 are sentences, S1  S2 is a sentence (biconditional )Propositional Logic Semantics  Each model specifies true/false for each proposition symbol With these symbols, 8 possible models, can be enumerated automatically. Rules for evaluating truth with respect to a model m: S is true iff S is false S1 S2 is true iff S1 is true and S2 is true S1 S2 is true iff S1is true or S2 is true S1 S2 is true iff S1 is false or S2 is true i.e., is false iff S1 is true and S2 is falsePrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 36
  • S1 S2 is true iff S1 S2 is true and S2 S1 is true  Simple recursive process evaluates an arbitrary sentence, e.g., P1,2 (P2,2 P3,1) = true (true false) = true true = trueTruth Table for ConnectivesValidity and satisfiabilityA sentence is valid if it is true in all models,e.g., True, A A, A A, (A (A B)) BValidity is connected to inference via the Deduction Theorem:KB ╞ α if and only if (KB α) is validA sentence is satisfiable if it is true in some modele.g., A B, CA sentence is unsatisfiable if it is true in no modelse.g., A ASatisfiability is connected to inference via the following:KB ╞ α if and only if (KB α) is unsatisfiableLogical Equivalence  Two sentences are logically equivalent iff true in same models: α ≡ ß iff α╞ β and β╞ αResolution  Conjunctive Normal Form (CNF) o conjunction of disjunctions of literals clauses  E.g., (A Ú ØB) Ù (B Ú ØC Ú ØD)  Resolution is sound and complete for propositional logic  Conversion to CNF B1,1 (P1,2 P2,1)β 1. Eliminate , replacing α β with (α β) (β α). (B1,1 (P1,2 P2,1)) ((P1,2 P2,1) B1,1)Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 37
  • 2. Eliminate , replacing α β with α β. ( B1,1 P1,2 P2,1) ( (P1,2 P2,1) B1,1) 3. Move inwards using de Morgans rules and double-negation: ( B1,1 P1,2 P2,1) (( P1,2 P2,1) B1,1) 4. Apply distributivity law ( over ) and flatten: ( B1,1 P1,2 P2,1) ( P1,2 B1,1) ( P2,1 B1,1)  Resolution Algorithm  Proof by contradiction, i.e., show KB α unsatisfiable  Proportional ResolutionAdvantages of propositional logic: · Simple. · No decidability problems.Limitations of Propositional Calculus  An argument may not be provable using propositional logic, but may be provable using predicate logic.  e.g. All horses are animals. Therefore, the head of a horse is the head of an animal.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 38
  • We know that this argument is correct and yet it cannot be proved under propositional logic, but it can be proved under predicate logic.  Limited representational power.  Simple statements may require large and awkward representations.4.2.4.First Order Predicate Logic (FOPL)Predicate Logic (FOPL) providesi) A language to express assertions (axioms) about certain "worlds ".ii) An inference system or deductive apparatus whereby we may draw conclusions fromsuch assertions andiii) A semantics based on set theory.The language of FOPL consists ofi) A set of constant symbols (to name particular individuals such as table, a,b,c,d,e etc. - these dependon the application)ii) A set of variables (to refer to arbitrary individuals)iii) A set of predicate symbols (to represent relations such as On, Above etc. -these depend on theapplication)iv) A set of function symbols (to represent functions - these depend on the application)v) The logical connectives −, . , υ ,ω , ¬ (to capture and, or, implies, iff and not)vi) The Universal Quantifier, ∀ : and the Existential Quantifer, ∃ :(to capture ―all‖, ―every‖, ―some‖,―few‖, ―there exists‖ etc.)vii) Normally a special binary relation of equality (=) is considered (at least in mathematics) as part ofthe language.QuantificationUniversal Qunatification  <variables> <sentence> Everyone at KEC is smart: x At(x,KEC) Smart(x) x P is true in a model m iff P is true with x being each possible object in the model  Roughly speaking, equivalent to the conjunction of instantiations of P At(KingJohn,KEC) Smart(KingJohn) At(Richard,KEC) Smart(Richard)  Common mistake to avoid:  Typically, is the main connective with  Common mistake: using as the main connective with : x At(x,KEC) Smart(x) means ―Everyone is at KEC and everyone is smartExistential Quantification  <variables> <sentence>  Someone at KEC is smart:  x At(x,KEC) Smart(x)  x P is true in a model m iff P is true with x being some possible object in the model  Typically, is the main connective with  Common mistake: using as the main connective with : x At(x,KEC) Smart(x) is true if there is anyone who is not at KECPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 39
  • Properties of Quantifiers  x y is the same as y x  x y is the same as y x  x y is not the same as y x x y Loves(x,y)  ―There is a person who loves everyone in the world‖  y x Loves(x,y)  ―Everyone in the world is loved by at least one person‖  Quantifier duality: each can be expressed using the other  x Likes(x,IceCream) x Likes(x,IceCream)  x Likes(x,Broccoli) x Likes(x,Broccoli)Example 1For example, Suppose we wish to represent in FOPL the following sentencesa) ―Everyone loves Janet‖b) ―Not everyone loves Daphne‖c) ―Everyone is loved by their mother‖Introducing constant symbols j and d to represent Janet and Daphne respectively; a binarypredicate symbol L to represent loves and the unary function symbol1 m to represent themother of a person given as argument.The above sentences may now be represented in FOPL bya) ∀x.L(x,j)b) ∃x.¬L(x,d)c) ∀x.L(m(x),x)Example 2We will express the following in first order predicate calculus―sam is Kind‖―Every kind person has someone who loves them‖―sam loves someone‖The non-logical symbols of our language arethe constant sam andthe unary predicate (or property) Kind andthe binary predicate Loves.We may represent the above sentences as1. Kind(sam)2. ∀x.(Kind(x) υ ∃y.Loves(y,x))3. ∃y Loves(sam,y)Some Semantic IssuesAn interpretation (of the language of FOPL) consists ofa) a non empty set of objects (the Universe of Discourse, D) containing designatedindividuals named by the constant symbolsb) for each function symbol in the language of FOPL, a corresponding function over D.c) for each predicate symbol in the language of FOPL, a corresponding relation over D.An interpretation is said to be a model for a set of sentences Γ, if each sentence of Γ is trueunder the given interpretation.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 40
  •  The interpretation of a formula F in first order predicate logic consists of fixing a domain of values (non empty) D and of an association of values for every constant, function and predicate in the formula F as follows:  (1) Every constant has an associated value in D.  (2) Every function f, of arity n, is defined by the correspondence where D n = {(x 1 ,..., x n ) | x1 D,..., x n D} n  (3) Every predicate of arity n, is defined by the correspondence P : D {a, f }  Interpretation ExampleUsing FOL  Brothers are siblings x,y Brother(x,y) Sibling(x,y)  Ones mother is ones female parent m,c Mother(c) = m (Female(m) Parent(m,c))  ―Sibling‖ is symmetric x,y Sibling(x,y) Sibling(y,x)  Marcus was a man  Man(Marcus)  Marcus was a Pompeian  Pompeian(Marcus)  All Pompeians were Romans  x:Pompeian(x)Roman(x)  All Romans were either loyal to Caesar or hated him  x:Roman(x) loyalto(x,Caesar) V hate(x, Caesar)  Everyone is loyal to someone  x: y: loyalto(x,y)  People only try to assassinate rulers they are not loyal to x: y: person(x) AND ruler(y) AND tryassassinate(x,y) ~loyalto(x,y)Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 41
  • 4.3Inference RulesComplex deductive arguments can be judged valid or invalid based on whether or not the steps in thatargument follow the nine basic rules of inference. These rules of inference are all relatively simple,although when presented in formal terms they can look overly complex.Conjunction:1. P2. Q3. Therefore, P and Q.1. It is raining in New York.2. It is raining in Boston3. Therefore, it is raining in both New York and BostonSimplification1. P and Q.2. Therefore, P.1. It is raining in both New York and Boston.2. Therefore, it is raining in New York.Addition1. P2. Therefore, P or Q.1. It is raining2. Therefore, either either it is raining or the sun is shining.Absorption1. If P, then Q.2. Therfore, If P then P and Q.1. If it is raining, then I will get wet.2. Therefore, if it is raining, then it is raining and I will get wet.Modus Ponens1. If P then Q.2. P.3. Therefore, Q.1. If it is raining, then I will get wet.2. It is raining.3. Therefore, I will get wet.Modus Tollens1. If P then Q.2. Not Q. (~Q).3. Therefore, not P (~P).Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 42
  • 1. If it had rained this morning, I would have gotten wet.2. I did not get wet.3. Therefore, it did not rain this morning.Hypothetical Syllogism1. If P then Q.2. If Q then R.3. Therefore, if P then R.1. If it rains, then I will get wet.2. If I get wet, then my shirt will be ruined.3. If it rains, then my shirt will be ruined.Disjunctive Syllogism1. Either P or Q.2. Not P (~P).3. Therefore, Q.1. Either it rained or I took a cab to the movies.2. It did not rain.3. Therefore, I took a cab to the movies.Constructive Dilemma1. (If P then Q) and (If R then S).2. P or R.3. Therefore, Q or S.1. If it rains, then I will get wet and if it is sunny, then I will be dry.2. Either it will rain or it will be sunny.3. Therefore, either I will get wet or I will be dry.The above rules of inference, when combined with the rules of replacement, mean that propositionalcalculus is "complete." Propositional calculus is simply another name for formal logicUnificationI in computer science and logic, is an algorithmic process by which one attempts to solvethe satisfiability problem. The goal of unification is to find a substitution which demonstrates that twoseemingly different terms are in fact either identical or just equal. Unification is widely usedin automated reasoning, logic programming and programming language type system implementation.Several kinds of unification are commonly studied: that for theories without any equations (the emptytheory) is referred to as syntactic unification: one wishes to show that (pairs of) terms are identical.If one has a non-empty equational theory, then one is typically interested in showing the equality of (apair of) terms; this is referred to as semantic unification. Since substitutions can be ordered intoa partial order, unification can be understood as the procedure of finding a join on a lattice.We also need some way of binding variables to values in a consistent way so that components ofsentences can be matched. This is the process of Unification.BindingA binding list is a set of enteries of the form v = e where v is a variable and e is an object. Given anexpression p and a binding list we write for the instantiation of p using bindings in.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 43
  • Unifier Given two expressions p and q, a unifier is a binding list such that = . Most General Unifier MGU is a unifier that binds the fewest variables or binds them to less specific expressions. Most General Unifier (MGU) Algorithm for expressions p and q 1. If either p or q is either an object constant or a variable, then: i). If p=q, then p and q already unify and we return { }. ii). If either p or q is a variable, then return the result binding that variable to the other expression. iii). Otherwise return failure. 2.If neither p nor q is an object constant or a variable, then they must both be compound expressions (suppose each is made up ofp1,......pn and q1,......qm) and must be unified one component at a time. i).If the types and any function/relation constant are not equal, return failure. ii).If , then return failure. iii).Otherwise and do the following a).Set = { }, k = 0. b).If k = n then stop and return as the mgu of p and q. c).Otherwise, increment k and apply mgu recursively to and . If and unify, add new bindings to and return to step 2(c)ii. If and fail to unify then return failure for unification of p and q. Resolution Refutation System  Resolution is a technique for proving theorems in predicate calculus  Resolution is a sound inference rule that, when used to produce a refutation, is also complete  In an important practical application resolution theorem proving particularly the resolution refutation system, has made the current generation of Prolog interpreters possible  The resolution principle, describes a way of finding contradictions in a data base of clauses with minimum substitution  Resolution Refutation proves a theorem by negating the statement to be proved and adding the negated goal to the set of axioms that are known or have been assumed to be true  It then uses the resolution rule of inference to show that this leads to a contradiction  Steps in Resolution Refutation Proof 1. Put the premises or axioms into clause form 2. Add the negations of what is to be proved in clause form, to the set of axioms 3. Resolve these clauses together, producing new clauses that logically follow from them 4. Produce a contradiction by generating the empty clause Discussion on Steps  Resolution Refutation proofs require that the axioms and the negation of the goal be placed in a normal form called the clause form Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 44
  •  Clausal form represents the logical database as a set of disjunctions of literals  Resolution is applied to two clauses when one contains a literal and the other its negation  The substitutions used to produce the empty clause are those under which the opposite of the negated goal is true  If these literals contain variables, they must be unified to make them equivalent  A new clause is then produced consisting of the disjunction of all the predicates in the two clauses minus the literal and its negative instance (which are said to have been ―resolved away‖)  Example: We wish to prove that ―Fido will die‖ from the statements that ―Fido is a dog‖ and ―all dogs are animals‖ and ―all animals will die‖ Convert these predicates to clause form Predicate Form Clause Form x: [dog(x)animal(x)] ¬ dog(x) V animal(x) Dog(fido) Dog(fido) y:[animal(y) die(y)] ¬ animal(y) V die(y) Apply ResolutionQ.1. Anyone passing the Artificial Intelligence exam and winning the lottery is happy. But anyonewho studies or is lucky can pass all their exams. Ali did not study be he is lucky. Anyone who is luckywins the lottery. Is Ali happy?Anyone passing the AI Exam and winning the lottery is happy X:[pass(x,AI) Λ win(x, lottery) happy(x)]Anyone who studies or is lucky can pass all their exams X Y [studies(x) V lucky(x) pass(x,y)]Ali did not study but he is lucky¬ study(ali) Λ lucky(ali)Anyone who is lucky wins the lottery X: [lucky(x) win(x,lottery)]Change to clausal form 1. ¬pass(X,AI) V ¬win(X,lottery) V happy(X) 2. ¬study(Y) V pass(Y,Z) 3. ¬lucky(W) V pass(W,V)Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 45
  • 4. ¬study(ali) 5. Lucky(ali) 6. ¬lucky(u) V win(u,lottery) 7. Add negation of the conclusion ¬happy(ali)4.4.Symbolic versus statistical reasoningThe (Symbolic) methods basically represent uncertainty belief as being True, False, or Neither True nor False.Some methods also had problems with Incomplete Knowledge Contradictions in the knowledge.Statistical methods provide a method for representing beliefs that are not certain (or uncertain) but forwhich there may be some supporting (or contradictory) evidence.Statistical methods offer advantages in two broad scenarios:Genuine Randomness -- Card games are a good example. We may not be able to predict any outcomes with certainty but we have knowledge about the likelihood of certain items (e.g. like being dealt an ace) and we can exploit this.Exceptions -- Symbolic methods can represent this. However if the number of exceptions is large such system tend to break down. Many common sense and expert reasoning tasks for example. Statistical techniques can summarise large exceptions without resorting enumeration.Basic Statistical methods -- ProbabilityThe basic approach statistical methods adopt to deal with uncertainty is via the axioms of probability: Probabilities are (real) numbers in the range 0 to 1. A probability of P(A) = 0 indicates total uncertainty in A, P(A) = 1 total certainty and values in between some degree of (un)certainty. Probabilities can be calculated in a number of ways.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 46
  • Very Simply Probability = (number of desired outcomes) / (total number of outcomes) So given a pack of playing cards the probability of being dealt an ace from a full normal deck is 4 (the number of aces) / 52 (number of cards in deck) which is 1/13. Similarly the probability of being dealt a spade suit is 13 / 52 = 1/4. If you have a choice of number of items k from a set of items n then the formula is applied to find the number of ways of making this choice. (! = factorial). So the chance of winning the national lottery (choosing 6 from 49) is to 1. Conditional probability, P(A|B), indicates the probability of of event A given that we know event B has occurred.Bayes Theorem This states: o This reads that given some evidence E then probability that hypothesis is true is equal to the ratio of the probability that E will be true given times the a priori evidence on the probability of and the sum of the probability of E over the set of all hypotheses times the probability of these hypotheses. o The set of all hypotheses must be mutually exclusive and exhaustive. o Thus to find if we examine medical evidence to diagnose an illness. We must know all the prior probabilities of find symptom and also the probability of having an illness based on certain symptoms being observed.Bayesian statistics lie at the heart of most statistical reasoning systems.How is Bayes theorem exploited? The key is to formulate problem correctly: P(A|B) states the probability of A given only Bs evidence. If there is other relevant evidence then it must also be considered.Herein lies a problem: All events must be mutually exclusive. However in real world problems events are not generally unrelated. For example in diagnosing measles, the symptoms of spots and a fever are related. This means that computing the conditional probabilities gets complex. In general if a prior evidence, p and some new observation, N then computing grows exponentially for large sets of p All events must be exhaustive. This means that in order to compute all probabilities the set of possible events must be closed. Thus if new information arises the set must be created afresh and all probabilities recalculated.Thus Simple Bayes rule-based systems are not suitable for uncertain reasoning. Knowledge acquisition is very hard. Too many probabilities needed -- too large a storage space. Computation time is too large. Updating new information is difficult and time consuming. Exceptions like ``none of the above cannot be represented. Humans are not very good probability estimators.However, Bayesian statistics still provide the core to reasoning in many uncertain reasoning systemswith suitable enhancement to overcome the above problems.We will look at three broad categories:Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 47
  • Certainty factors, Dempster-Shafer models, Bayesian networks.Belief Models and Certainty FactorsThis approach has been suggested by Shortliffe and Buchanan and used in their famous medicaldiagnosis MYCIN system.MYCIN is essentially and expert system. Here we only concentrate on the probabilistic reasoningaspects of MYCIN. MYCIN represents knowledge as a set of rules. Associated with each rule is a certainty factor A certainty factor is based on measures of belief B and disbelief D of an hypothesis given evidence E as follows: where is the standard probability. The certainty factor C of some hypothesis given evidenceE is defined as:Reasoning with Certainty factors Rules expressed as if evidence list then there is suggestive evidence with probability, p for symptom . MYCIN uses rules to reason backward to clinical data evidence from its goal of predicting a disease-causing organism. Certainty factors initially supplied by experts changed according to previous formulae. How do we perform reasoning when several rules are chained together? Measures of belief and disbelief given several observations are calculated as follows: How about our belief about several hypotheses taken together? Measures of belief given several hypotheses and to be combined logically are calculated as follows: Disbelief is calculated similarly.Bayesian networksThese are also called Belief Networks or Probabilistic Inference Networks. Initially developed byPearl (1988).The basic idea is: Knowledge in the world is modular -- most events are conditionally independent of most other events. Adopt a model that can use a more local representation to allow interactions between events that only affect each other.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 48
  • Some events may only be unidirectional others may be bidirectional -- make a distinction between these in model. Events may be causal and thus get chained together in a network.Implementation A Bayesian Network is a directed acyclic graph: o A graph where the directions are links which indicate dependencies that exist between nodes. o Nodes represent propositions about events or events themselves. o Conditional probabilities quantify the strength of dependencies.Consider the following example: The probability, that my car wont start. If my car wont start then it is likely that o The battery is flat or o The staring motor is broken.In order to decide whether to fix the car myself or send it to the garage I make the following decision: If the headlights do not work then the battery is likely to be flat so i fix it myself. If the starting motor is defective then send car to garage. If battery and starting motor both gone send car to garage.The network to represent this is as follows:Fig. A simple Bayesian networkReasoning in Bayesian(belief) nets Probabilities in links obey standard conditional probability axioms. Therefore follow links in reaching hypothesis and update beliefs accordingly. A few broad classes of algorithms have been used to help with this: o Pearlss message passing method. o Clique triangulation. o Stochastic methods. o Basically they all take advantage of clusters in the network and use their limits on the influence to constrain the search through net. o They also ensure that probabilities are updated correctly. Since information is local information can be readily added and deleted with minimum effect on the whole network. ONLY affected nodes need updating. Example o Consider problem: ―block-lifting‖ o B: the battery is charged. o L: the block is liftable. o M: the arm moves. o G: the gauge indicates that the battery is chargedPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 49
  • o o p(G,M,B,L) = p(G|M,B,L)p(M|B,L)p(B|L)p(L)= p(G|B)p(M|B,L)p(B)p(L) o Specification:  Traditional: 16 rows  BayessianNetworks: 8 rows Reasoning: top-down o Example: o if the block is liftable, compute the probability of arm moving. o I.e., Compute p(M | L) o Solution: Insert parent nodes: p(M|L) = p(M,B|L) + p(M,¬B|L) Use chain rule: p(M|L) = p(M|B,L)p(B|L) + p(M|,¬B,L)p(¬B|L) Remove independent node: p(B|L) =p(B) : B does not have PARENT p(¬B|L) = p(¬B) = 1 – p(B) p(M|L) = p(M|B,L)p(B) + p(M|,¬B,L)(1 – p(B)) = 0.9´0.95 + 0.0 ´(1 – 0.95) = 0.855 Reasoning: bottom-up Example: If the arm cannot move Compute the probability that the block is not liftable. I.e., Compute: p(¬L|¬M) Use Bayesian Rule: Compute top-down reasoning p(¬M|¬L) = 0.9525 –exercise p(¬L) = 1- p(L) = 1- 0.7 = 0.3Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 50
  • Chapter-5Knowledge Representation.solving complex AI problems requires large amounts of knowledge and mechanisms for manipulatingthat knowledge. The inference mechanisms that operate on knowledge, relay on the ways knowledgeis represented. A good knowledge representation model allows for more powerful inferencemechanisms that operate on them. While representing knowledge one has to consider two things. 1. Facts, which are truths in some relevant world. 2. Representation of facts in some chosen formalism . These are the things which are actuallymanipulated by inference mechanism.Knowledge representation schemes are useful only if there are functions that map facts torepresentations and vice versa. AI is more concerned with a natural language representation of factsand the functions which map natural language sentences into some representational formalism. Anappealing way of representing facts is using the language of logic. Logical formalism provides a wayof deriving new knowledge from the old through mathematical deduction. In this formalism, we canconclude that a new statement is true by proving that it follows from the statements already known tobe facts.STRUCTURED REPRESNTATION OF KNOWLEDGERepresenting knowledge using logical formalism, like predicate logic, has several advantages. Theycan be combined with powerful inference mechanisms like resolution, which makes reasoning withfacts easy. But using logical formalism complex structures of the world, objects and theirrelationships, events, sequences of events etc. can not be described easily.A good system for the representation of structured knowledge in a particular domain should possesthe following four properties:(i) Representational Adequacy:- The ability to represent all kinds of knowledge that are needed in thatdomain.(ii) Inferential Adequacy :- The ability to manipulate the represented structure and infer newstructures.(iii) Inferential Efficiency:- The ability to incorporate additional information into the knowledgestructure that will aid the inference mechanisms.(iv) Acquisitional Efficiency :- The ability to acquire new information easily, either by direct insertionor by program control.The techniques that have been developed in AI systems to accomplish these objectives fall under twocategories:1. Declarative Methods:- In these knowledge is represented as static collection of facts which aremanipulated by general procedures. Here the facts need to be stored only one and they can be used inany number of ways. Facts can be easily added to declarative systems without changing the generalprocedures.2. Procedural Method:- In these knowledge is represented as procedures. Default reasoning andprobabilistic reasoning are examples of procedural methods. In these, heuristic knowledge of ―How todo things efficiently ―can be easily represented.In practice most of the knowledge representation employ a combination of both. Most of theknowledge representation structures have been developed to handle programs that handle naturallanguage input. One of the reasons that knowledge structures are so important is that they provide away to represent information about commonly occurring patterns of things . such descriptions aresome times called schema. One definition of schema isPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 51
  • ―Schema refers to an active organization of the past reactions, or of past experience, which mustalways be supposed to be operating in any well adapted organic response‖.By using schemas, people as well as programs can exploit the fact that the real world is not random.There are several types of schemas that have proved useful in AI programs. They includeApproaches to Knowledge RepresentationWe briefly survey some representation schemes. Simple relational knowledge Inheritable knowledge Inferential knowledge Procedural knowledgeSimple relational knowledgeThe simplest way of storing facts is to use a relational method where each fact about a set of objects isset out systematically in columns. This representation gives little opportunity for inference, but it canbe used as the knowledge basis for inference engines. Simple way to store facts. Each fact about a set of objects is set out systematically in columns (Fig. 7). Little opportunity for inference. Knowledge basis for inference engines. Exist in form of tables, like tables in database _Each relation (row in table) itself can provide very weak inferential capabilities. _Relations may serve as the input to powerful inference enginesFigure: Simple Relational KnowledgeWe can ask things like: Who is dead? Who plays Jazz/Trumpet etc.?This sort of representation is popular in database systems.Inheritable knowledgeRelational knowledge is made up of objects consisting of attributes corresponding associated values.We extend the base more by allowing inference mechanisms: Property inheritance o elements inherit values from being members of a class. o data must be organised into a hierarchy of classes (Fig. 8).Fig. 8 Property Inheritance Hierarchy Boxed nodes -- objects and values of attributes of objects.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 52
  • Values can be objects with attributes and so on. Arrows -- point from object to its value. This structure is known as a slot and filler structure, semantic network or a collection of frames.The algorithm to retrieve a value for an attribute of an instance object: 1. Find the object in the knowledge base 2. If there is a value for the attribute report it 3. Otherwise look for a value of instance if none fail 4. Otherwise go to that node and find a value for the attribute and then report it 5. Otherwise search through using isa until a value is found for the attribute.Inferential knowledge Facts represented in a logical form, which facilitates reasoning. An inference engine is required.Procedural knowledge Coding actions to be performed when a condition satisfied. o Example o IF o  Has fever more than 39 C  Be lazy eating  Skin has red dots o THEN  Suspect petechial fever Implementation: o Writing actions in LISP Programming Language o Writing actions in Production System Framework, like CLISP, JESSIssue in Knowledge RepresentationBelow are listed issues that should be raised when using a knowledge representation technique:Important AttributesPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 53
  • -- Are there any attributes that occur in many different types of problem? There are two instance and isa and each is important because each supports property inheritance.Relationships -- What about the relationship between the attributes of an object, such as, inverses, existence, techniques for reasoning about values and single valued attributes. We can consider an example of an inverse in band(John Zorn,Naked City) This can be treated as John Zorn plays in the band Naked City or John Zorns band is Naked City. Another representation is band = Naked City band-members = John Zorn, Bill Frissell, Fred Frith, Joey Barron,Granularity -- At what level should the knowledge be represented and what are the primitives. Choosing the Granularity of Representation Primitives are fundamental concepts such as holding, seeing, playing and as English is a very rich language with over half a million words it is clear we will find difficulty in deciding upon which words to choose as our primitives in a series of situations. At what level of detail should knowledge be represented? o Balance the trade-off  High-level facts may not be adequate for inference  Low-level primitives may require a lot of storage.How should sets of objects be represented? o By names. o By extensional definition. o By intensional definitionGiven a large amount of knowledge stored, how can relevant parts be accessed? o Selecting an initial structure. o Revising the choice.If Tom feeds a dog then it could become:feeds(tom, dog)If Tom gives the dog a bone like:gives(tom, dog,bone) Are these the same?In any sense does giving an object food constitute feeding?If give(x, food) feed(x) then we are making progress.But we need to add certain inferential rules.In the famous program on relationships Louise is Bills cousin How do we represent this? louise =daughter (brother or sister (father or mother( bill))) Suppose it is Chris then we do not know if itis Chris as a male or female and then son applies as well.Clearly the separate levels of understanding require different levels of primitives and these need manyrules to link together apparently similar primitives.Obviously there is a potential storage problem and the underlying question must be what level ofcomprehension is needed.What are Semantic networks and frames? Explain with suitable examples.A semantic network is often used as a form of knowledge representation. It is a directed graphconsisting of vertices which represent concepts and edges which represent semantic relationsbetween the concepts.The following semantic relations are commonly represented in a semantic net.Meronymy (A is part of B)Holonymy (B has A as a part of itself)Hyponymy (or troponymy) (A is subordinate of B; A is kind of B)Hypernymy (A is superordinate of B)Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 54
  • Synonymy (A denotes the same as B)Antonymy (A denotes the opposite of B)ExampleThe physical attributes of a person can be represented as in the following figure using a semanticnet.These values can also be represented in logic as: isa(person, mammal), instance(Mike-Hall, person)team(Mike-Hall, Cardiff)( For detail of semantic network please follow lecture slides)FramesFrames are descriptions of conceptual individuals. Frames can exist for ``real objects such as ``TheEverest Hotel, sets of objects such as ``Hotels, or more ``abstract objects such as ``Cola-Wars or``Gulfwar.A Frame system is a collection of objects. Each object contains a number of slots. A slot representsan attribute. Each slot has a value. The value of an attribute can be another object. Frames areessentially defined by their relationships with other frames. Relationships between frames arerepresented using slots. If a frame f is in a relationship r to a frame g, then we put the value g in the rslot of f.For example, suppose we are describing the following genealogical tree:The frame describing Adam might look something like:Adam:sex: Malespouse: Bethchild: (Charles Donna Ellen)where sex, spouse, and child are slots.The genealogical tree would then be described by (at least) seven frames, describing the followingindividuals: Adam, Beth, Charles, Donna, Ellen, Male, and Female.A frame can be considered just a convenient way to represent a set of predicates applied to constantPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 55
  • symbols (e.g. ground instances of predicates.). For example, the frame above could be written:sex(Adam,Male)spouse(Adam,Beth)child(Adam,Charles)child(Adam,Donna)child(Adam,Ellen)Frames can also be regarded as an extension to Semantic nets. Indeed it is not clear where thedistinction between a semantic net and a frame ends. Semantic nets initially we used to representlabelled connections between objects. As tasks became more complex the representation needs tobe more structured. The more structured the system it becomes more beneficial to use frames. Aframe is a collection of attributes or slots and associated values that describe some real world entity.Semantic Networks First introduced by Quillian back in the late-60s M. Ross Quillian. "Semantic Memories", In M. M. Minsky, editor, Semantic Information Processing, pages 216-270. Cambridge, MA: MIT Press, 1968 Semantic network is simple representation scheme which uses a graph of labeled nodes and labeled directed arcs to encode knowledge  Nodes – objects, concepts, events  Arcs – relationships between nodes Graphical depiction associated with semantic networks is a big reason for their popularity The idea behind a semantic network is that knowledge is often best understood as a set of concepts that are related to one another. Arcs define binary relations which hold between objects denoted by the nodes. inheritance reasoning in semantic nets  follow MemberOf & SubsetOf links  up the hierarchy  stop at the category with a property link  to infer the property for an individualthe semantic net advantagesPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 56
  •  simplicity of inference  ease of visualizing, even for large nets  ease of representing default values for categories  & ease of overriding defaults by more specific valuesbut, awkward or impossible  to capture many of FOLs representational capabilities  negation, disjunction, existential quantification, ...  when extended to do so, it loses its attractive simplicityFrames  Frames – semantic net with properties  A frame represents an entity as a set of slots (attributes) and associated values  A frame can represent a specific entry, or a general concept  Frames are implicitly associated with one another because the value of a slot can be another frame  3 components of a frame · frame name · attributes (slots) · values (fillers: list of values, range, string, etc.)  Book Frame  Slot  Filler  Title  AI. A modern Approach  Author  Russell & Norvig  Year  2003  More natural support of values then semantic nets  Can be easily implemented using object-oriented programming techniques  Inheritance is easily controlled  Similar to Object-Oriented programming paradigm  Benefits:  Makes programming easier by grouping related knowledge  Easily understood by non-developers  Expressive powerPrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 57
  •  Easy to set up slots for new properties and relations  Easy to include default information and detect missing values  Drawbacks:  No standards  More of a general methodology than a specific representation: • Frame for a class-room will be different for a professor and for a maintenance worker No associated reasoning/inference mechanismsConceptual Dependency (CD)This representation is used in natural language processing in order to represent them earning of thesentences in such a way that inference we can be made from the sentences. It is independent of thelanguage in which the sentences were originally stated. CD representations of a sentence is built outof primitives , which are not words belonging to the language but are conceptual , these primitives arecombined to form the meaning s of the words. As an example consider the event represented by thesentence.In the above representation the symbols have the following meaning:Arrows indicate direction of dependencyDouble arrow indicates two may link between actor and the actionP indicates past tenseATRANS is one of the primitive acts used by the theory . it indicates transfer of possession0 indicates the object case relationR indicates the recipient case relationConceptual dependency provides a str5ucture in which knowledge can be represented and also a set ofbuilding blocks from which representations can be built. A typical set of primitive actions areATRANS - Transfer of an abstract relationship(Eg: give)PTRANS - Transfer of the physical location of an object(Eg: go)PROPEL - Application of physical force to an object (Eg: push)MOVE - Movement of a body part by its owner (eg : kick)GRASP - Grasping of an object by an actor(Eg: throw)INGEST - Ingesting of an object by an animal (Eg: eat)EXPEL - Expulsion of something from the body of an animal (cry)MTRANS - Transfer of mental information(Eg: tell)MBUILD - Building new information out of old(Eg: decide)SPEAK - Production of sounds(Eg: say)ATTEND - Focusing of sense organ toward a stimulus (Eg: listen)A second set of building block is the set of allowable dependencies among the conceptualizationdescribe in a sentence.Advantages of CD: Using these primitives involves fewer inference rules. Many inference rules are already represented in CD structure. The holes in the initial structure help to focus on the points still to be established.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 58
  • Disadvantages of CD: Knowledge must be decomposed into fairly low level primitives. Impossible or difficult to find correct set of primitives. A lot of inference may still be required. Representations can be complex even for relatively simple actions. Consider: Dave bet Frank five pounds that Wales would win the Rugby World Cup. Complex representations require a lot of storageApplications of CD:MARGIE (Meaning Analysis, Response Generation and Inference on English) -- model natural language understanding.SAM (Script Applier Mechanism) -- Scripts to understand stories. See next section.PAM (Plan Applier Mechanism) -- Scripts to understand stories.ScriptA structured representation of background world knowledge. This structure contains knowledgeabout objects, actions, and situations that are described in the input text. If we consider theKnowledge about Shooping or Entering into the Restraunt. This kind of stored Knowledge aboutstereotypical events is called a Script.A script is a structure that prescribes a set of circumstances which could be expected to follow onfrom one another.It is similar to a thought sequence or a chain of situations which could be anticipated.It could be considered to consist of a number of slots or frames but with more specialised roles.Scripts are beneficial because: Events tend to occur in known runs or patterns. Causal relationships between events exist. Entry conditions exist which allow an event to take place Prerequisites exist upon events taking place. E.g. when a student progresses through a degree scheme or when a purchaser buys a house.The components of a script include:Entry Conditions -- these must be satisfied before events in the script can occur.Results -- Conditions that will be true after events in script occur.Props -- Slots representing objects involved in events.Roles -- Persons involved in the events.Track -- Variations on the script. Different tracks may share components of the same script.Scenes -- The sequence of events that occur. Events are represented in conceptual dependency form.Advantages of Scripts: Ability to predict events. A single coherent interpretation may be build up from a collection of observations.Disadvantages: Less general than frames. May not be suitable to represent all kinds of knowledge.Scripts are useful in describing certain situations such as robbing a bank.Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 59
  • Chapter-6Refer pulchowk notes+What is an artificial neural network?An artificial neural network is a system based on the operation of biological neural networks, in otherwords, is an emulation of biological neural system. Why would be necessary the implementation ofartificial neural networks? Although computing these days is truly advanced, there are certain tasksthat a program made for a common microprocessor is unable to perform; even so a softwareimplementation of a neural network can be made with their advantages and disadvantages.Advantages: A neural network can perform tasks that a linear program can not. When an element of the neural network fails, it can continue without any problem by their parallel nature. A neural network learns and does not need to be reprogrammed. It can be implemented in any application. It can be implemented without any problem.Disadvantages:Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 60
  • The neural network needs training to operate. The architecture of a neural network is different from the architecture of microprocessors therefore needs to be emulated. Requires high processing time for large neural networks.Expert System ArchitectureExplanation Based LearningThe EBL HypothesisBy understanding why an example is a member of a concept, can learn the essential properties of theconceptTrade-offthe need to collect many examples for the ability to ―explain‖ single examples (a ―domain‖ theory)Learning by Generalizing ExplanationsGiven– Goal (e.g., some predicate calculus statement)– Situation Description (facts)– Domain Theory (inference rules)– Operationality CriterionUse problem solver to justify, using the rules, the goal in terms of the facts.Generalize the justification as much as possible.The operationality criterion states which other terms can appear in the generalized result.• An explanation is an inter-connected collection of ―pieces‖ of knowledge (inference rules, rewriterules, etc.)• These ―rules‖ are connected using unification, as in Prolog• The generalization task is to compute the most general unifier that allows the ―knowledge pieces‖to be connected together as generally as possiblePrepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 61
  • Machine Vision The goal of Machine Vision is to create a model of the real world from images – A machine vision system recovers useful information about a scene from its two dimensional projectionsThe world is three dimensional – Two dimensional digitized images Knowledge about the objects (regions) in a scene and projection geometry is required. The information which is recovered differs depending on the application – Satellite, medical images etc. Processing takes place in stages: – Enhancement, segmentation, image analysis and matching (pattern recognition).Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 62
  • Machine Vision StagesBoltzmann MachineA Boltzmann machine is a type of stochastic recurrent neural network invented by GeoffreyHinton and Terry Sejnowski. Boltzmann machines can be seen asthe stochastic, generative counterpart of Hopfield nets. They were one of the first examples of aneural network capable of learning internal representations, and are able to represent and (givensufficient time) solve difficult combinatoric problems. However, due to a number of issues discussedbelow, Boltzmann machines with unconstrained connectivity have not proven useful for practicalproblems in machine learning or inference. They are still theoretically intriguing, however, due to thelocality and Hebbian nature of their training algorithm, as well as their parallelism and theresemblance of their dynamics to simple physical processes. If the connectivity is constrained, thelearning can be made efficient enough to be useful for practical problems.They are named after the Boltzmann distribution in statistical mechanics, which is used in theirsampling function.A Boltzmann machine, like a Hopfield network, is a network of units with an "energy" defined for the network.It also has binary units, but unlike Hopfield nets, Boltzmann machine units are stochastic. The globalenergy, , in a Boltzmann machine is identical in form to that of a Hopfield network: Where:  is the connection strength between unit and unit .  is the state, , of unit .  is the threshold of unit . The connections in a Boltzmann machine have two restrictions:  . (No unit has a connection with itself.)  . (All connections are symmetric.)Prepared By: Najar Aryal, BCT(III/II), KEC,Kalimati Page 63