Artificial intelligence

Basic to AI, cognitive science.
A subject at Purbanchal University, Nepal


Objectives of the Course (Artificial Intelligence)
– Understand the definition of artificial intelligence
– Machine learning
– Natural language
– Expert systems
– Neural networks
– Have a fair idea of the types of problems that can currently be solved by computers and those that are as yet beyond their ability.

Introduction to AI

Types of AI tasks: One possible classification of AI tasks is into three classes: mundane tasks, formal tasks and expert tasks.

1. Mundane tasks: by mundane tasks we mean all those tasks which nearly all of us can do routinely in order to act and interact in the world. This includes: perception, vision, speech, natural language (understanding, generation and translation), common sense reasoning, and robot control.

2. Formal tasks:
   a. Games: chess, backgammon, Go, etc. To solve these problems we must explore a large number of solutions quickly and choose the best one.
   b. Mathematics:
      i. Geometry and logic theory: programs such as the Logic Theorist proved mathematical theorems; it actually proved several theorems from classical mathematics textbooks.
      ii. Integral calculus: programs such as Mathematica and Mathcad can perform complicated symbolic integration and differentiation.
   c. Proving properties of programs, e.g. correctness: manipulate symbols and reduce the problem.

3. Expert tasks: by expert tasks we mean tasks at which only some people are good and which require extensive training. This includes:
   a. Engineering: design, fault finding, manufacturing
   b. Planning
   c. Scientific analysis
   d. Medical diagnosis
   e. Financial analysis

What is AI? AI is one of the newest disciplines, formally initiated in 1956 when the name was coined by McCarthy. The advent of computers made it possible for the first time for people to test the models they proposed for learning, reasoning, perceiving, etc. Definitions may be organized into four categories:
1. Systems that think like humans
2. Systems that act like humans
3. Systems that think rationally
4. Systems that act rationally

1. Systems that think like humans: This requires "getting inside" the human mind to see how it works and then comparing our computer programs to this. This is what cognitive science attempts to do. Another way to do this is to observe a human solving problems and argue that one's programs go about problem solving in a similar way.
For example, the General Problem Solver (GPS) was an early computer program that attempted to model human thinking. The developers were not so interested in whether or not GPS solved problems correctly; they were more interested in showing that it solved problems like people do, going through the same steps and taking around the same amount of time to perform those steps.

2. Systems that act like humans: The first proposal for success in building a program that acts humanly was the Turing Test. To be considered intelligent, a program must be able to act sufficiently like a human to fool an interrogator. The machine and the human are isolated from the person carrying out the test, and messages are exchanged via a keyboard and screen. If the person cannot distinguish between the computer and the human being, then the computer must be intelligent. To pass this test requires: natural language processing (NLP), knowledge representation, automated reasoning, and machine learning. A total Turing Test also requires computer vision and robotics.

3. Systems that think rationally: Aristotle was one of the first to attempt to codify "thinking". For example: all computers use energy; using energy always generates heat; therefore, all computers generate heat. This initiated the field of logic. Formal logic was developed in the late nineteenth century. This was the first step toward enabling computer programs to reason logically. By 1965, programs existed that could, given enough time and memory, take a description of a problem in logical notation and find the solution, if one existed.

4. Systems that act rationally: Acting rationally means acting so as to achieve one's goals, given one's beliefs. An agent is just something that perceives and acts. In the logical approach to AI, the emphasis is on correct inferences. This is often part of being a rational agent, because one way to act rationally is to reason logically and then act on one's conclusions.
Foundations of AI:
• Philosophy: logic, methods of reasoning, mind as a physical system, foundations of learning, language, rationality.
• Mathematics: formal representation and proof, algorithms, computation, (un)decidability, (in)tractability, probability. Philosophers staked out most of the important ideas of AI, but the move to a formal science required mathematical formalism in three main areas: computation, logic and probability. Mathematicians proved that there exists an algorithm to prove any true statement in first-order logic. Analogously, Turing showed that there are some functions that no Turing machine can compute. Although undecidability and non-computability are important to the understanding of computation, the notion of intractability has had a much greater impact on computer science and AI. A class of problems is called intractable if the time required to solve instances of the class grows at least exponentially with the size of the instances.
• Economics: utility, decision theory.
• Neuroscience: the physical substrate for mental activity.
• Psychology: phenomena of perception and motor control, experimental techniques. The principal characteristic of cognitive psychology is that the brain possesses and processes information. The claim is that beliefs, goals, and reasoning steps can be useful components of a theory of human behavior. The knowledge-based agent has three key steps:
  o A stimulus is translated into an internal representation.
  o The representation is manipulated by cognitive processes to derive new internal representations.
  o These are translated into actions.
• Computer engineering: building fast computers.
• Control theory: designing systems that maximize an objective function over time.
• Linguistics: knowledge representation, grammar. Having a theory of how humans successfully process natural language is an AI-complete problem: if we could solve this problem, we would have created a model of intelligence.

AI History: The intellectual roots of AI date back to early studies of the nature of knowledge and reasoning. The dream of making a computer imitate humans also has a very early history; the concept of intelligent machines is found in Greek mythology, such as the story told in the 8th century A.D. about Pygmalion, the legendary king of Cyprus. Aristotle (384-322 BC) developed an informal system of syllogistic logic, which is the basis of the first formal deductive reasoning system. Early in the 17th century, Descartes proposed that the bodies of animals are nothing more than complex machines. Pascal in 1642 made the first mechanical digital calculating machine.

In 1943, McCulloch and Pitts proposed modeling neurons using on/off devices. In the 1950s, Claude Shannon and Alan Turing tried to write chess-playing programs. In 1956, John McCarthy coined the name "Artificial Intelligence". In the 1960s came the Logic Theorist, GPS (General Problem Solver), micro-worlds, and neural networks. In 1971, NP-completeness theory cast doubt on the general applicability of AI methods. In the 1970s, knowledge-based systems and expert systems were developed. In the 1980s, AI techniques came into widespread use and neural networks were rediscovered. The early AI systems used general methods and little knowledge; AI researchers realized that specialized knowledge is required for rich tasks to focus reasoning.
The 1990s saw major advances in all areas of AI, including:
• Machine learning, data mining
• Intelligent tutoring
• Case-based reasoning
• Multi-agent planning, scheduling
• Uncertain reasoning
• Natural language understanding and translation
• Vision, virtual reality, games, and other topics.
In 2000, the Nomad robot explored remote regions of Antarctica looking for meteorite samples.

Limits of AI Today: Today's successful AI systems operate in well-defined domains and employ narrow, specialized knowledge. Common sense knowledge is needed to function in complex, open-ended worlds, and such a system also needs to understand unconstrained natural language. These capabilities are not yet fully present in today's intelligent systems.

What can AI systems do? Today's AI systems have been able to achieve limited success in some of these tasks.
• In computer vision, systems are capable of face recognition.
• In robotics, we have been able to make vehicles that are mostly autonomous.
• In natural language processing, we have systems that are capable of simple machine translation.
• Today's expert systems can carry out medical diagnosis in a narrow domain.
• Speech understanding systems are capable of recognizing several thousand words of continuous speech.
• Planning and scheduling systems have been employed in scheduling experiments with the Hubble Telescope.
• Learning systems are capable of text categorization into about 1,000 topics.
• In games, AI systems can play at the grandmaster level in chess (world champion), checkers, etc.

What can AI systems NOT do yet?
• Understand natural language robustly (e.g., read and understand articles in a newspaper)
• Surf the web
• Interpret an arbitrary visual scene
• Learn a natural language
• Construct plans in dynamic real-time domains
• Exhibit true autonomy and intelligence

Goals in problem solving

Goal schemas: To build a system to solve a particular problem, we need to do four things:
1. Define the problem precisely. This definition must include precise specifications of what the initial situations will be, as well as what final situations constitute acceptable solutions to the problem.
2. Analyze the problem. A few very important features can have an immense impact on the appropriateness of various possible techniques for solving the problem.
3. Isolate and represent the task knowledge that is necessary to solve the problem.
4. Choose the best problem-solving techniques and apply them to the particular problem.

Search Problem: A search problem is characterized by an initial state and a goal-state description. The guesses are called operators, where a single operator transforms a state into another state which is expected to be closer to a goal state. The objective may be to find a goal state or to find a sequence of operators leading to a goal state.
Additionally, the problem may require finding just any solution or an optimum solution.

Constraint Satisfaction (problem solving): Here, the main objective is to discover some problem state which satisfies a given set of constraints. By viewing the problem as one of constraint satisfaction, a substantial amount of search can be reduced compared with a method that attempts to form a partial solution directly by choosing specific values for components of the eventual solution. Constraint satisfaction is a search procedure that operates in a space of constraint sets. The initial state contains the constraints that are originally given in the problem description. A goal state is any state that has been constrained "enough", where "enough" must be defined for each problem. Constraint satisfaction is a two-step process: first, constraints are discovered and propagated as far as possible throughout the system. Then, if there is still not a solution, search begins: a guess about something is made and added as a new constraint, propagation can then occur with this new constraint, and so forth.

It is fairly easy to see that a CSP can be given an incremental formulation as a standard search problem as follows:
• Initial state: the empty assignment, in which all variables are unassigned.
• Successor function: a value can be assigned to any unassigned variable, provided that it does not conflict with previously assigned variables.
• Goal test: the current assignment is complete.
• Path cost: a constant cost (e.g., 1) for every step.

What is a CSP?
• A finite set of variables V1, V2, …, Vn
• A nonempty domain of possible values for each variable: DV1, DV2, …, DVn
• A finite set of constraints C1, C2, …, Cm
• Each constraint Ci limits the values that variables can take, e.g., V1 ≠ V2

A state is defined as an assignment of values to some or all variables. A consistent assignment is one that does not violate the constraints.

CSP benefits:
• Standard representation pattern
• Generic goal and successor functions
• Generic heuristics (no domain-specific expertise).

An assignment is complete when every variable is mentioned. A solution to a CSP is a complete assignment that satisfies all constraints. Some CSPs require a solution that maximizes an objective function.

Examples of applications:
• Scheduling the time of observations on the Hubble Space Telescope
• Airline schedules
• Cryptography
• Computer vision -> image interpretation
• Scheduling your MS or PhD thesis exam
• Map coloring

Problem: F O R T Y + T E N + T E N = S I X T Y

Solution:
(digit 0) 2N ≡ 0 (mod 10)
(digit 1) 2E ≡ 0 (mod 10), so no carry from digit 0 is possible
Therefore N = 0 and E = 5. Then O = 9 and I = 1, requiring two carries. Further, S = F + 1. Digit 2 now gives the equation R + 2T + 1 = 20 + X. The smallest digit left is 2, so X >= 2 and therefore R + 2T >= 21. Since R and T cannot both be 7, one of them must be larger, that is, 8. We are left with some case checking.
Case R = 8: then T >= 6.5, that is, T = 7 and X = 3. There are no consecutive numbers left for F and S.
Case T = 8: then R >= 5, and as 5 is in use, R >= 6.
  Case R = 6: as X = 3, there are no consecutive numbers left.
  Case R = 7: as X = 4, we get F = 2 and S = 3. The remaining digit, 6, will be Y.
Therefore we get a unique solution:
  29786
    850
  + 850
  -----
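The incremental formulation above maps directly onto backtracking search. Here is a minimal sketch in Python using the classic Australia map-coloring instance; the region names and adjacencies are assumptions added for illustration, not taken from these notes.

```python
# A backtracking solver for the incremental CSP formulation above:
# start with the empty assignment and extend it one consistent value
# at a time, backtracking on dead ends.
def backtrack(assignment, variables, domains, neighbors):
    if len(assignment) == len(variables):      # goal test: assignment complete
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        # successor function: only values that do not conflict with
        # previously assigned neighboring variables
        if all(assignment.get(n) != value for n in neighbors[var]):
            result = backtrack({**assignment, var: value},
                               variables, domains, neighbors)
            if result is not None:
                return result
    return None                                # no consistent value: backtrack

variables = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
domains = {v: ["red", "green", "blue"] for v in variables}
neighbors = {"WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
             "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
             "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": []}

solution = backtrack({}, variables, domains, neighbors)
```

Because every successor step is consistent by construction, any complete assignment returned is a solution; path cost plays no role, since only the final assignment matters.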
  31486

Planning: The purpose of planning is to find a sequence of actions that achieves a given goal when performed starting in a given state. In other words, given a set of operator instances (defining the possible primitive actions of the agent), an initial state description, and a goal state description or predicate, the planning agent computes a plan.

Simple Planning Agent: Earlier we saw that problem-solving agents are able to plan ahead, that is, to consider the consequences of sequences of actions before acting. We also saw that knowledge-based agents can select actions based on explicit, logical representations of the current state and the effects of actions. This allows an agent to succeed in complex, inaccessible environments that are too difficult for a problem-solving agent.

Problem-Solving Agents + Knowledge-Based Agents = Planning Agents

In this module, we put these two ideas together to build planning agents. At the most abstract level, the task of planning is the same as problem solving. Planning can be viewed as a type of problem solving in which the agent uses beliefs about actions and their consequences to search for a solution over the more abstract space of plans.

What is a plan? A sequence of operator instances such that "executing" them in the initial state will change the world to a state satisfying the goal state description. Goals are usually specified as a conjunction of goals to be achieved.

Properties of planning algorithms:

Soundness:
• A planning algorithm is sound if all solutions found are legal plans:
  o All preconditions and goals are satisfied.
  o No constraints are violated (temporal, variable binding).

Completeness:
• A planning algorithm is complete if a solution can be found whenever one actually exists.
• A planning algorithm is strictly complete if all solutions are included in the search space.
Optimality:
• A planning algorithm is optimal if the order in which solutions are found is consistent with some measure of plan quality.
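The soundness condition, that every operator's preconditions hold when it is executed and the goals hold at the end, can be checked mechanically. A minimal sketch, assuming STRIPS-style operators given as (preconditions, add-list, delete-list) triples over sets of ground facts; the toy one-package logistics domain is invented for illustration.

```python
# Check that "executing" a plan in the initial state yields a state
# satisfying the goal description, with every precondition met en route.
def apply_op(state, op):
    pre, add, delete = op
    if not pre <= state:              # precondition violated: illegal step
        return None
    return (state - delete) | add     # STRIPS effect: delete, then add

def plan_is_legal(state, plan, goals):
    for op in plan:
        state = apply_op(state, op)
        if state is None:
            return False
    return goals <= state             # goals satisfied in the final state

# Assumed toy domain: fly one package from location A to B by plane.
load   = ({"at(obj1,A)", "at(plane,A)"}, {"in(obj1,plane)"}, {"at(obj1,A)"})
fly    = ({"at(plane,A)"}, {"at(plane,B)"}, {"at(plane,A)"})
unload = ({"in(obj1,plane)", "at(plane,B)"}, {"at(obj1,B)"}, {"in(obj1,plane)"})

initial = {"at(obj1,A)", "at(plane,A)"}
legal = plan_is_legal(initial, [load, fly, unload], {"at(obj1,B)"})
```

Reordering the plan to [fly, load, unload] fails the check, since flying first deletes the precondition at(plane,A) that load requires.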
Linear planning: The basic idea is to work on one goal until it is completely solved before moving on to the next goal. The planning algorithm maintains a goal stack.
Implications:
- No interleaving of goal achievement
- Efficient search if goals do not interact (much)
Advantages:
- Reduced search space, since goals are solved one at a time
- Advantageous if goals are (mainly) independent
- Linear planning is sound
Disadvantages:
- Linear planning may produce suboptimal solutions (based on the number of operators in the plan)
- Linear planning is incomplete.

Suboptimal plans are a result of linearity, goal interactions and poor goal ordering. For example:
Initial state: at(obj1, locA), at(obj2, locA), at(747, locA)
Goals: at(obj1, locB), at(obj2, locB)
Plan: [load(obj1, 747, locA); fly(747, locA, locB); unload(obj1, 747, locB); fly(747, locB, locA); load(obj2, 747, locA); fly(747, locA, locB); unload(obj2, 747, locB)]

Concept of non-linear planning: Use a goal set instead of a goal stack, and include in the search space all possible subgoal orderings. Goal interactions are handled by interleaving.
Advantages:
- Non-linear planning is sound.
- Non-linear planning is complete.
- Non-linear planning may be optimal with respect to plan length (depending on the search strategy employed).
Disadvantages:
- Larger search space, since all possible goal orderings may have to be considered.
- Somewhat more complex algorithm; more bookkeeping.

Non-linear planning algorithm:
NLP(initial-state, goals)
- state = initial-state; plan = []; goalset = goals; opstack = []
- Repeat until goalset is empty:
  - Choose a goal g from the goalset
  - If g does not match state, then:
    - Choose an operator o whose add-list matches goal g
    - Push o on the opstack
    - Add the preconditions of o to the goalset
  - While all preconditions of the operator on top of opstack are met in state:
    - Pop operator o from top of opstack
    - state = apply(o, state)
    - plan = [plan, o]

Means-Ends Analysis: Means-ends analysis centers on the detection of differences between the current state and the goal state. Once such a difference is isolated, an operator that can reduce the difference must be found. However, perhaps that operator cannot be applied to the current state; hence, we set up a subproblem of getting to a state in which it can be applied. The kind of backward chaining in which operators are selected and then subgoals are set up to establish the preconditions of the operators is known as operator subgoaling. Just like the other problem-solving techniques, means-ends analysis relies on a set of rules that can transform one problem state into another. However, these rules usually are not represented with complete state descriptions on each side. Instead, they are represented as a left side, which describes the conditions that must be met for the rule to be applicable, and a right side, which describes those aspects of the problem state that will be changed by the application of the rule. A separate data structure called a difference table indexes the rules by the differences that they can be used to reduce.

Algorithm: Means-Ends Analysis (MEA)
1. Compare CURRENT to GOAL. If there are no differences between them, then return.
2. Otherwise, select the most important difference and reduce it by doing the following until success or failure is signaled:
   a) Select a new operator O which is applicable to the current difference. If there are no such operators, then signal failure.
   b) Apply O to CURRENT. Generate descriptions of two states: O-START, a state in which O's preconditions are satisfied, and O-RESULT, the state that would result if O were applied in O-START.
   c) If FIRST-PART = MEA(CURRENT, O-START) and LAST-PART = MEA(O-RESULT, GOAL) are both successful, then signal success and return the result of concatenating FIRST-PART, O and LAST-PART.

Production Rule Systems: Since search is a very important process in the solution of hard problems for which no more direct techniques are available, it is useful to structure AI programs in a way that enables describing and performing the search process. Production systems provide such structures. A production system consists of:
· A set of rules, each consisting of a left side that determines the applicability of the rule and a right side that describes the operation to be performed if the rule is applied.
· One or more knowledge bases or databases that contain whatever information is appropriate for the particular task.
· A control strategy that specifies the order in which the rules will be compared to the database and a way of resolving the conflicts that arise when several rules match at once.
· A rule applier.

Characteristics of a Control Strategy:
· The first requirement of a good control strategy is that it causes motion. Control strategies that do not cause motion will never lead to a solution. For example, a strategy that on each cycle chooses at random from among the applicable rules causes motion and will eventually lead to a solution.
· The second requirement of a good control strategy is that it be systematic. Even though a control strategy causes motion, it is likely to arrive at the same state several times during the search process and to use many unnecessary steps unless it is systematic. A systematic control strategy is necessary for global motion (over the course of several steps) as well as local motion (over the course of a single step).

Combinatorial explosion is the phenomenon of the time required to find an optimal schedule (solution) growing exponentially with the size of the problem. In other words, when the time required to solve the problem grows far beyond the estimated time, the phenomenon is known as combinatorial explosion.

A heuristic function is a function that maps problem state descriptions to measures of desirability, usually represented as numbers. Well-designed heuristic functions can play an important part in efficiently guiding a search process toward a solution. The purpose of the heuristic function is to guide the search process in the most profitable direction by suggesting which path to follow first when more than one is available. The more accurately the heuristic function estimates the correct direction at each node of the search tree, the more direct the solution process. In the extreme, the heuristic function would be so good that essentially no search would be required: the system would move directly to a solution.

Problem Characteristics: In order to choose the most appropriate method for a particular problem, it is necessary to analyze the problem along several key dimensions:
· Is the problem decomposable into a set of independent smaller or easier subproblems?
· Can solution steps be ignored or at least undone if they prove unwise?
· Is the problem's universe predictable?
· Is a good solution to the problem obvious without comparison to all other possible solutions?
· Is the desired solution a state of the world or a path to a state?
· Is a large amount of knowledge absolutely required to solve the problem, or is knowledge important only to constrain the search?
· Can a computer that is simply given the problem return the solution, or will solving the problem require interaction between the computer and a person?

Forward chaining systems: In a forward chaining system the facts in the system are represented in a working memory which is continually updated. Rules in the system represent possible actions to take when specified conditions hold on items in the working memory; they are sometimes called condition-action rules. The conditions are usually patterns that must match items in the working memory, while the actions usually involve adding or deleting items from the working memory. The interpreter controls the application of the rules, given the working memory, thus controlling the system's activity. It is based on a cycle of activity sometimes known as a recognize-act cycle. The system first finds all the rules whose conditions hold, given the current state of working memory. It then selects one and performs the actions in the action part of the rule. The actions result in a new working memory, and the cycle begins again. This cycle is repeated until either no rules fire or some specified goal state is satisfied.

Backward chaining systems:
So far we have looked at how rule-based systems can be used to draw new conclusions from existing data, adding these conclusions to a working memory. This approach is most useful when we know all the initial facts but don't have much idea what the conclusion might be. If we do know what the conclusion might be, or have some specific hypothesis to test, forward chaining systems may be inefficient. We could keep forward chaining until no more rules apply or the hypothesis has been added to the working memory, but in the process the system is likely to do a lot of irrelevant work, adding uninteresting conclusions to working memory.

Forward and backward chaining: The restriction to just one positive literal may seem somewhat arbitrary and uninteresting, but it is actually very important for three reasons:
1. Every Horn clause can be written as an implication whose premise is a conjunction of positive literals and whose conclusion is a single positive literal.
2. Inference with Horn clauses can be done through the forward chaining and backward chaining algorithms, which we explain next. Both of these algorithms are very natural, in that the inference steps are obvious and easy for humans to follow.
3. Deciding entailment with Horn clauses can be done in time that is linear in the size of the knowledge base.
This last fact is a pleasant surprise. It means that logical inference is very cheap for many propositional knowledge bases that are encountered in practice.

Forward chaining is an example of the general concept of data-driven reasoning, that is, reasoning in which the focus of attention starts with the known data. It can be used within an agent to derive conclusions from incoming percepts, often without a specific query in mind. For example, the wumpus agent might TELL its percepts to the knowledge base using an incremental forward-chaining algorithm in which new facts can be added to the agenda to initiate new inferences.
In humans, a certain amount of data-driven reasoning occurs as new information arrives.

  P ⇒ Q
  L ∧ M ⇒ P
  B ∧ L ⇒ M
  A ∧ B ⇒ L
  A ∧ P ⇒ L
  A
  B

Figure: A simple knowledge base of Horn clauses (and its corresponding AND-OR graph).

The backward-chaining algorithm, as its name suggests, works backwards from the query. If the query q is known to be true, then no work is needed. Otherwise, the algorithm finds those implications in the knowledge base that conclude q. If all the premises of one of those implications can be proved true (by backward chaining), then q is true. When applied to the query Q in the figure, it works back down the graph until it reaches a set of known facts that forms the basis for a proof. Backward chaining is a form of goal-directed reasoning. It is useful for answering specific questions such as "What shall I do now?" and "Where are my keys?" Often, the cost of backward chaining is much less than linear in the size of the knowledge base, because the process touches only relevant facts. In general, an agent should share the work between forward and backward reasoning, limiting forward reasoning to the generation of facts that are likely to be relevant to queries that will be solved by backward chaining.
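Forward chaining on the Horn-clause knowledge base in the figure can be sketched in a few lines of Python; encoding each rule as a (premises, conclusion) pair is an implementation choice for this sketch, not something from the text.

```python
# Propositional forward chaining for Horn clauses: repeatedly fire any
# rule whose premises are all known, until the query is derived or no
# new facts appear.
def forward_chain(rules, facts, query):
    known = set(facts)
    changed = True
    while changed:
        if query in known:
            return True
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return query in known

# The knowledge base from the figure above.
rules = [
    (frozenset({"P"}), "Q"),
    (frozenset({"L", "M"}), "P"),
    (frozenset({"B", "L"}), "M"),
    (frozenset({"A", "B"}), "L"),
    (frozenset({"A", "P"}), "L"),
]
entailed = forward_chain(rules, {"A", "B"}, "Q")  # A,B -> L -> M -> P -> Q
```

Each rule fires at most once, so the loop terminates; a counter-based agenda version achieves the linear-time bound mentioned above, but this fixpoint form is the simplest to read.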
MYCIN-style probability and its application: In artificial intelligence, MYCIN was an early expert system designed to identify bacteria causing severe infections, such as bacteremia and meningitis, and to recommend antibiotics, with the dosage adjusted for the patient's body weight. The name derived from the antibiotics themselves, as many antibiotics have the suffix "-mycin". The MYCIN system was also used for the diagnosis of blood clotting diseases. MYCIN was developed over five or six years in the early 1970s at Stanford University, in Lisp, by Edward Shortliffe. MYCIN was never actually used in practice, but research indicated that it proposed an acceptable therapy in about 69% of cases, which was better than the performance of infectious disease experts who were judged using the same criteria.

MYCIN operated using a fairly simple inference engine and a knowledge base of roughly 600 rules. It would query the physician running the program via a long series of simple yes/no or textual questions. At the end, it provided a list of possible culprit bacteria ranked from high to low based on the probability of each diagnosis, its confidence in each diagnosis's probability, the reasoning behind each diagnosis, and its recommended course of drug treatment.

MYCIN was based on certainty factors rather than probabilities. These certainty factors (CF) are in the range [-1, +1], where -1 means certainly false and +1 means certainly true. The system was based on rules of the form:

IF: the patient has signs and symptoms s1, s2, …, sn, and certain background conditions t1, t2, …, tm hold
THEN: conclude that the patient has disease di with certainty CF

The idea was to use production rules of this kind in an attempt to approximate the calculation of the conditional probabilities p(di | s1, s2, …, sn), and to provide a scheme for accumulating evidence that approximated the reasoning process of an expert.

Practical use / application: MYCIN was never actually used in practice.
This wasn't because of any weakness in its performance; as mentioned, in tests it outperformed members of the Stanford medical school faculty. Some observers raised ethical and legal issues related to the use of computers in medicine: if a program gives the wrong diagnosis or recommends the wrong therapy, who should be held responsible? However, the greatest problem, and the reason MYCIN was not used in routine practice, was the state of technologies for system integration at the time it was developed. MYCIN was a stand-alone system that required a user to enter all relevant information about a patient by typing responses to the questions that MYCIN posed.
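The scheme for accumulating evidence mentioned above combines the certainty factors of rules that bear on the same hypothesis. A minimal sketch using the standard MYCIN-style combination formulas; the function name is an assumption for this illustration.

```python
# Combine two certainty factors in [-1, +1] for the same hypothesis.
def cf_combine(a, b):
    if a >= 0 and b >= 0:
        return a + b * (1 - a)      # both support: accumulate toward +1
    if a <= 0 and b <= 0:
        return a + b * (1 + a)      # both refute: accumulate toward -1
    return (a + b) / (1 - min(abs(a), abs(b)))  # conflicting evidence

# Two rules supporting the same diagnosis with CF 0.6 and 0.4 combine
# to 0.6 + 0.4 * (1 - 0.6) = 0.76: stronger than either alone, but
# still short of certainty.
combined = cf_combine(0.6, 0.4)
```

Note that the combination is commutative and never leaves [-1, +1], which is what lets a system accumulate evidence rule by rule in any order.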
Chapter 2: Intelligence

Introduction to intelligence: Artificial Intelligence is concerned with the design of intelligence in an artificial device. The term was coined by McCarthy in 1956. There are two ideas in the definition:
1. Intelligence
2. Artificial device

What is intelligence?
• Is it that which characterizes humans? Or is there an absolute standard of judgment?
• Accordingly, there are two possibilities:
  o A system with intelligence is expected to behave as intelligently as a human.
  o A system with intelligence is expected to behave in the best possible manner.
• Secondly, what type of behavior are we talking about?
• Are we looking at the thought process or reasoning ability of the system?
• Or are we only interested in the final manifestations of the system in terms of its actions?

Given this scenario, different interpretations have been used by different researchers in defining the scope and view of Artificial Intelligence.
1. One view is that artificial intelligence is about designing systems that are as intelligent as humans. This view involves trying to understand human thought and an effort to build machines that emulate the human thought process. This is the cognitive science approach to AI.
2. The second approach is best embodied by the concept of the Turing Test. Turing held that in the future computers could be programmed to acquire abilities rivaling human intelligence. As part of his argument, Turing put forward the idea of an 'imitation game', in which a human being and a computer would be interrogated under conditions where the interrogator would not know which was which, the communication being entirely by textual messages. Turing argued that if the interrogator could not distinguish them by questioning, then it would be unreasonable not to call the computer intelligent. Turing's 'imitation game' is now usually called 'the Turing test' for intelligence.
Common sense reasoning: Common sense is the ability to analyze a situation based on its context, using millions of integrated pieces of common knowledge. The ability to use common sense knowledge depends on being able to do commonsense reasoning. Commonsense reasoning is a central part of intelligent behavior. Example: everyone knows that if you drop a glass of water, the glass will break and the water will spill. However, this knowledge is not obtained from a formula or equation for a falling body or from the equations governing fluid flow. Common sense knowledge means what everyone knows. Examples: • Every person is younger than the person's mother. • People do not like being repeatedly interrupted. • If you hold a knife by its blade then the blade may cut you. • If you drop paper into a flame then the paper will burn. • You start getting hungry again a few hours after eating a meal. • People generally sleep at night. Common sense reasoning: the ability to use common sense knowledge. Example: • If you have a problem, think of a past situation where you solved a similar problem.
  • 13. • If you fail at something, imagine how you might have done things differently. • If you observe an event, try to infer what prior event might have caused it. A commonsense text-understanding pipeline works as follows: • The template is a frame with slots and slot fillers. • The template is fed to a script classifier, which classifies what script is active in the template. • The template and the script are passed to a reasoning problem builder specific to the script, which converts the template into a commonsense reasoning problem. • The problem and a commonsense knowledge base are passed to a commonsense reasoner, which infers and fills in missing details to produce a model of the input text. • The model provides a deeper representation of the input than is provided by the template alone. Agents: An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. • Human agent: eyes, ears, and other organs for sensors; hands, legs, mouth, and other body parts for actuators. • Robotic agent: cameras and infrared range finders for sensors; various motors for actuators. Agents and environments: • The agent program runs on the physical architecture to produce the agent function f • agent = architecture + program
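The slogan "agent = architecture + program" can be sketched in a few lines of Python. The `Agent` class, the vacuum-world program, and all names here are illustrative assumptions, not something defined in the notes:

```python
class Agent:
    """An agent couples an architecture (percept intake, action output)
    with an agent program that maps a percept to an action."""
    def __init__(self, program):
        self.program = program      # the agent program: percept -> action
        self.percepts = []          # percept history fed in by the sensors

    def step(self, percept):
        # architecture: record what the sensors delivered,
        # then hand the decision to the agent program
        self.percepts.append(percept)
        return self.program(percept)    # action passed to the actuators

# A toy vacuum-cleaner program (hypothetical example):
def vacuum_program(percept):
    location, status = percept
    return "Suck" if status == "Dirty" else "Move"

agent = Agent(vacuum_program)
print(agent.step(("A", "Dirty")))   # Suck
print(agent.step(("A", "Clean")))   # Move
```

Swapping in a different `program` gives a different agent on the same architecture, which is exactly the separation the equation expresses.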
  • 14. Rational agents: An agent should strive to "do the right thing", based on what it can perceive and the actions it can perform. The right action is the one that will cause the agent to be most successful • Performance measure: An objective criterion for success of an agent's behavior • E.g., performance measure of a vacuum-cleaner agent could be amount of dirt cleaned up, amount of time taken, amount of electricity consumed, amount of noise generated, etc. Rational Agent: For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever builtin knowledge the agent has. Rationality is distinct from omniscience (all knowing with infinite knowledge) • Agents can perform actions in order to modify future percepts so as to obtain useful information (information gathering, exploration) • An agent is autonomous if its behavior is determined by its own experience (with ability to learn and adapt) PEAS: Performance measure, Environment, Actuators, Sensors • Must first specify the setting for intelligent agent design • Consider, e.g., the task of designing an automated taxi driver: – Performance measure, Environment, Actuators and Sensors. 
Must first specify the setting for intelligent agent design • Consider, e.g., the task of designing an automated taxi driver: – Performance measure: Safe, fast, legal, comfortable trip, maximize profits – Environment: Roads, other traffic, pedestrians, customers – Actuators: Steering wheel, accelerator, brake, signal, horn – Sensors: Cameras, sonar, speedometer, GPS, odometer, engine sensors, keyboard Agent: Medical diagnosis system • Performance measure: Healthy patient, minimize costs, lawsuits • Environment: Patient, hospital, staff • Actuators: Screen display (questions, tests, diagnoses, treatments, referrals) • Sensors: Keyboard (entry of symptoms, findings, patient's answers) Agent: Part-picking robot • Performance measure: Percentage of parts in correct bins • Environment: Conveyor belt with parts, bins • Actuators: Jointed arm and hand • Sensors: Camera, joint angle sensors Environment types: Fully observable (vs. partially observable): An agent's sensors give it access to the complete state of the environment at each point in time. • Deterministic (vs. stochastic): The next state of the environment is completely determined by the current state and the action executed by the agent. (If the environment is deterministic except for the actions of other agents, then the environment is strategic) • Episodic (vs. sequential): The agent's experience is divided into atomic "episodes" (each episode consists of the agent perceiving and then performing a single action), and the choice of action in each episode depends only on the episode itself. Static (vs. dynamic): The environment is unchanged while an agent is deliberating. (The environment is semidynamic if the environment itself does not change with the passage of time but the agent's performance score does) • Discrete (vs. continuous): A limited number of distinct, clearly defined percepts and actions. 14
  • 15. • Single agent (vs. multi-agent): An agent operating by itself in an environment. Agent types: Four basic types in order of increasing generality: • Simple reflex agents • Model-based reflex agents • Goal-based agents • Utility-based agents Simple reflex agents: The visual input from a single camera comes in at a rate of about 50 MB per second, so a lookup table for an hour of driving would need on the order of 2^(60×60×50M) entries. However, we can summarize certain portions of the table by noting commonly occurring input/output associations. For example, if the car in front brakes, then the driver should also brake. In other words, some processing is done on the visual input to establish the condition "Brake lights in front are on", and this triggers some established connection to the action "start braking". Such a connection is called a condition-action rule, written as: If condition then action. Fig. simple reflex agent. Model-based reflex agents: Fig. model-based reflex agent. A simple reflex agent works only if the correct action can be chosen based on the current percept alone. Even for the simple braking rule above, we need some sort of internal description of the world state. To determine if
  • 16. the car in front is braking, we would probably need to compare the current image with the previous one to see if the brake lights have come on. For example, from time to time the driver looks in the rear-view mirror to check on the location of nearby vehicles. When the driver is not looking in the mirror, vehicles in the next lane are invisible. However, deciding on a lane change requires that the driver know the location of vehicles in the next lane. Goal-based agents: Knowing about the state of the world is not always enough for the agent to know what to do next. For example, at an intersection, the taxi driver can turn left, turn right, or go straight. Which turn it should make depends on where it is trying to get to: its goal. Goal information describes states that are desirable and that the agent should try to achieve. The agent can combine goal information with information about what its actions achieve in order to plan sequences of actions that achieve those goals. Search and planning are the subfields of AI devoted to finding action sequences that achieve goals. Decision making of this kind is fundamentally different from condition-action rules, in that it involves consideration of the future. In the reflex agent design this information is not used, because the designer has pre-computed the correct action for the various cases. A goal-based agent could reason that if the car in front has its brake lights on, it will slow down. From the way in which the world evolves, the only action that will achieve the goal of not hitting the braking car is to slow down, and doing so requires hitting the brakes. The goal-based agent is more flexible but takes longer to decide what to do. Fig. goal-based agent. Utility-based agents: Fig. utility-based agent
  • 17. Goals alone are not enough to generate high-quality behavior. For example, there are many action sequences that will get the taxi to its destination, but some are quicker, safer, more reliable, or cheaper. Goals just provide a crude distinction between "happy" and "unhappy" states, whereas a more general performance measure should allow a comparison of different world states. The "happiness" of an agent is called utility. Utility can be represented as a function that maps states into real numbers: the larger the number, the higher the utility of the state. A complete specification of the utility function allows rational decisions in two kinds of cases where goals have trouble. First, when there are conflicting goals, only some of which can be achieved, the utility function specifies the appropriate trade-off. Second, when there are several goals that the agent can aim for, none of which can be achieved with certainty, utility provides a way in which the likelihood of success can be weighed against the importance of the goals. Learning agents:
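The condition-action rules and the utility function described in the preceding slides can be sketched as follows. The rule, the state names, and the utility numbers are invented purely for illustration:

```python
# A condition-action rule agent: "if condition then action".
def simple_reflex_agent(percept):
    if percept == "brake-lights-on":
        return "brake"
    return "drive"

# A utility-based choice: pick the action whose resulting state
# has the highest utility (a mapping from states to real numbers).
def utility_based_choice(actions, result, utility):
    return max(actions, key=lambda a: utility(result(a)))

# Toy taxi example with hypothetical states and utilities:
utilities = {"crash": -100.0, "slow-safe": 10.0, "fast-risky": 2.0}
result = {"brake": "slow-safe", "accelerate": "fast-risky", "nothing": "crash"}
best = utility_based_choice(result.keys(), result.get, utilities.get)
print(best)   # brake
```

The reflex agent needs only the current percept; the utility-based agent instead compares the worlds its actions would produce, which is why it can trade off conflicting goals.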
  • 18. Chapter 3 Knowledge Representation Representation and mapping: We have some characterisations of AI, so that when an AI problem arises, we will be able to put it into context, find the correct techniques and apply them. We have introduced the agents' language so that we can talk about intelligent tasks and how to carry them out. We have also looked at search in the general case, which is central to AI problem solving. Most pieces of software have to deal with data of some type, and in AI we use the more grandiose title of "knowledge" to stand for data including (i) facts, such as the temperature of a patient; (ii) procedures, such as how to treat a patient with a high temperature; and (iii) meaning, such as why a patient with a high temperature should not be given a hot bath. Accessing and utilizing all these kinds of information will be vital for an intelligent agent to act rationally. For this reason, knowledge representation is our final general consideration before we look at particular problem types. To a large extent, the way in which you organize information available to and generated by your intelligent agent will be dictated by the type of problem you are addressing. Often, the best ways of representing knowledge for particular techniques are known. However, as with the problem of how to search, we need a lot of flexibility in the way we represent information. Therefore, it is worth looking at four general schemes for representing knowledge, namely logic, semantic networks, production rules and frames. Knowledge representation continues to be a much-researched topic in AI because of the realization fairly early on that how information is arranged can often make or break an AI application. Knowledge Representation is a combination of data structures and interpretive procedures that will lead to "knowledgeable" behavior.
This definition is not entirely satisfactory from an anthropological perspective because of its emphasis on behavior rather than on the system of knowledge, but it captures the central idea that data plus rules results in knowledge. In some sense the anthropologist working in the field is attempting to acquire and analyze an alien representation of knowledge. The goals of AI and anthropology are not identical: most anthropologists are not interested in writing programs that behave knowledgeably; they are interested in representing knowledge. Fig. Mapping between facts and internal representations: facts are mapped into internal representations by English understanding, operated on by reasoning programs, and mapped back into facts by English generation.
  • 19. Logical Representations If all human beings spoke the same language, there would be a lot less misunderstanding in the world. The problem with software engineering in general is that there are often slips in communication which mean that what we think we've told an agent and what we've actually told it are two different things. One way to reduce this, of course, is to specify and agree upon some concrete rules for the language we use to represent information. To define a language, we need to specify the syntax of the language and the semantics. To specify the syntax of a language, we must say what symbols are allowed in the language and what are legal constructions (sentences) using those symbols. To specify the semantics of a language, we must say how the legal sentences are to be read, i.e., what they mean. If we choose a particular well defined language and stick to it, we are using a logical representation. Certain logics are very popular for the representation of information, and range in terms of their expressiveness. More expressive logics allow us to translate more sentences from our natural language into the language defined by the logic. Some popular logics are:  Propositional Logic  First Order Predicate Logic  Higher Order Predicate Logic  Fuzzy Logic Approach to knowledge representation: A good system for the representation of knowledge in a particular domain should possess the following properties: 1. Representational Adequacy- the ability to represent all of the kinds of knowledge that are needed in that domain. 2. Inferential Adequacy- the ability to manipulate the representational structures in such a way as to derive new structures corresponding to new knowledge inferred from old. 3. Inferential Efficiency- the ability to incorporate into the knowledge structure additional information that can be used to focus the attention of the inference mechanisms in the most promising directions. 4. 
Acquisitional Efficiency- the ability to acquire new information easily. The simplest case involves direct insertion of new knowledge into the database. Multiple techniques for knowledge representation exist. Many programs rely on more than one technique. Issues in knowledge representation: 1. Are any attributes of objects so basic that they occur in almost every problem domain? We need to make sure that they are handled appropriately in each of the mechanisms we propose. If such attributes exist, what are they? 2. Are there any important relationships that exist among attributes of objects? 3. At what level should knowledge be represented? Is there a good set of primitives into which all knowledge can be broken down? Is it helpful to use such primitives? 4. How should sets of objects be represented? 5. Given a large amount of knowledge stored in a database, how can relevant parts be accessed when they are needed? 19
  • 20. Logical Agents: In logical agent we design agents that can form representations of the world, use a process of inference to derive new representations about the world, and use these new representations to deduce what to do. Knowledge based agents: The central component of a knowledge-based agent is its knowledge base, or KB. Informally, a knowledge base is a set of sentences. (Here "sentence" is used as a technical term. It is related but is not identical to the sentences of English and other natural languages.) Each sentence is expressed in a language called a knowledge representation language and represents some assertion about the world. There must be a way to add new sentences to the knowledge base and a way to query what is known. The standard names for these tasks are TELL and ASK, respectively. Both tasks may involve inference-that is, deriving new sentences from old. In logical agents, which are the main subject of study in this chapter, inference must obey the fundamental requirement that when one ASKS a question of the knowledge base, the answer should follow from what has been told (or rather, TELL) to the knowledge base previously. Figure: A generic knowledge-based agent. Figure shows the outline of a knowledge-based agent program. Like all our agents, it takes a percept as input and returns an action. The agent maintains a knowledge base, KB, which may initially contain some background knowledge. Each time the agent program is called, it does three things. First, it TELLS the knowledge base what it perceives. Second, it ASKS the knowledge base what action it should perform. In the process of answering this query, extensive reasoning may be done about the current state of the world, about the outcomes of possible action sequences, and so on. Third, the agent records its choice with TELL and executes the action. The second TELL is necessary to let the knowledge base know that the hypothetical action has actually been executed. 
MAKE-PERCEPT-SENTENCE constructs a sentence asserting that the agent perceived the given percept at the given time. MAKE-ACTION-QUERY constructs a sentence that asks what action should be done at the current time. Finally, MAKE-ACTION-SENTENCE constructs a sentence asserting that the chosen action was executed. The knowledge-based agent is not an arbitrary program for calculating actions. It is amenable to a description at the knowledge level, where we need specify only what the agent knows and what its goals are in order to fix its behavior. For example, an automated taxi might have the goal of delivering a passenger to Sitapaila and might know that it is at Kalanki and that it can follow any link between the two locations if multiple links exist. This analysis is independent of how the taxi works at the implementation level.
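The three-step TELL/ASK loop above can be sketched in Python. The toy KB here merely stores sentences as strings and answers queries by membership lookup instead of real inference; the function names mirror the generic agent outline, but all implementation details are illustrative assumptions:

```python
class KB:
    """A toy knowledge base: a set of sentences with TELL and ASK."""
    def __init__(self):
        self.sentences = set()
    def tell(self, sentence):
        self.sentences.add(sentence)
    def ask(self, query):
        # stand-in for real inference: just check membership
        return query in self.sentences

def make_percept_sentence(percept, t):
    return f"Percept({percept}, {t})"

def make_action_sentence(action, t):
    return f"Did({action}, {t})"

def kb_agent(kb, percept, t, choose_action):
    kb.tell(make_percept_sentence(percept, t))   # 1. TELL what it perceives
    action = choose_action(kb, t)                # 2. ASK what to do
    kb.tell(make_action_sentence(action, t))     # 3. TELL the action taken
    return action

# Usage with a trivial decision procedure (always move forward):
kb = KB()
action = kb_agent(kb, "breeze", 0, lambda kb, t: "forward")
print(action)   # forward
```

The second TELL in step 3 is the point made in the text: the knowledge base must learn that the hypothetical action was actually executed.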
  • 21. Formal logic - connectives, syntax, semantics: • Syntax – Rules for constructing legal sentences in the logic – Which symbols we can use (English: letters, punctuation) – How we are allowed to combine symbols – Propositions, e.g. "it is wet" – Connectives: and, or, not, implies, iff (equivalent) – Brackets, T (true) and F (false) • Semantics – How we interpret (read) sentences in the logic – Assigns a meaning to each sentence – Defines how connectives affect truth – "P and Q" is true if and only if P is true and Q is true – Use truth tables to work out the truth of statements • Example: "All lecturers are seven foot tall" – A valid sentence (syntax) – And we can understand the meaning (semantics) – This sentence happens to be false (there is a counterexample) Syntax: well-formed formulas. Logical symbols: and, or, not, all, at least one, brackets, variables, equality (=), true, false. Predicate and function symbols (for example, Cat(x) for "x is a Cat"). Term: variables and function expressions (for example, x or mother(x)). Formula: any combination of terms and logical symbols (for example, "Cat(x) and Sleeps(x)"). Sentence: formulas without free variables (for example, "All x: Cat(x) and Sleeps(x)"). Knowledge bases consist of sentences. These sentences are expressed according to the syntax of the representation language, which specifies all the sentences that are well formed. "X+Y=4" is a well-formed sentence but "X4Y+=" is not. There are literally dozens of different syntaxes, some with lots of Greek letters and exotic mathematical symbols, and some with rather visually appealing diagrams with arrows and bubbles. A logic must also define the semantics of the language. Semantics has to do with the "meaning" of sentences. In logic, the definition is more precise: the semantics of the language defines the truth of each sentence with respect to each possible world.
For example, the usual semantics adopted for arithmetic specifies that the sentence "x + y = 4" is true in a world where x is 2 and y is 2, but false in a world where x is 1 and y is 1. In standard logics, every sentence must be either true or false in each possible world; there is no "in between". When we need to be precise, we will use the term model in place of "possible world." Now that we have a notion of truth, we are ready to talk about logical reasoning. This involves the relation of logical entailment between sentences: the idea that a sentence follows logically from another sentence. In mathematical notation, we write a |= p to mean that the sentence a entails the sentence p. The formal definition of
  • 22. entailment is this: a |= p if and only if, in every model in which a is true, p is also true. Another way to say this is that if a is true, then p must also be true. Informally, the truth of p is "contained" in the truth of a. The relation of entailment is familiar from arithmetic: the sentence x + y = 4 entails the sentence 4 = x + y. The property of completeness is also desirable: an inference algorithm is complete if it can derive any sentence that is entailed. For many knowledge bases the set of consequences is infinite, and completeness becomes an important issue. Fortunately, there are complete inference procedures for logics that are sufficiently expressive to handle many knowledge bases. We have described a reasoning process whose conclusions are guaranteed to be true in any world in which the premises are true; in particular, if KB is true in the real world, then any sentence a derived from KB by a sound inference procedure is also true in the real world. Equivalence, validity, and satisfiability: The first concept is logical equivalence: two sentences a and b are logically equivalent if they are true in the same set of models. We write this as a ≡ b. The second concept we will need is validity. A sentence is valid if it is true in all models. For example, the sentence P V ¬P is valid. Valid sentences are also known as tautologies. A sentence is satisfiable if it is true under at least one interpretation; it is valid if it is true under all interpretations; and it is invalid if it is false under some interpretation. Example: "All x: Cat(x) and Sleeps(x)". If this is interpreted on an island which has only one cat, and that cat always sleeps, the sentence is satisfiable. Since not all cats in all interpretations always sleep, the sentence is not valid. The final concept we will need is satisfiability. A sentence is satisfiable if it is true in some model.
For example, the knowledge base given earlier, (R1 Ʌ R2Ʌ R3 Ʌ R4 Ʌ R5), is satisfiable because there are three models in which it is true. If a sentence a is true in a model m, then we say that m satisfies a , or that m is a model of a. Satisfiability can be checked by enumerating the possible models until one is found that satisfies the sentence. Determining the satisfiability of sentences in propositional logic was the first problem proved to be NP-complete. A|=B if and only if the sentence (AɅ¬B) is unsatisfiable. Examples of Tautology: A tautology is a redundancy, a needless repetition of an idea. For example: Best of the best. Worst of the worst of the worst. Mother of the mother of the mother This is not a teacher, this is a professor. This is not noise, this is music. This is not music, this is noise. a. The propositions α∨ ¬α and ¬(α∧¬α) are tautologies. Therefore, 1=P(α∨ ¬α) =P(α)+P( ¬α). Rearranging gives the desired result. b. The proposition α↔((α∧β)∨ (α∧¬β)) and ¬((α∧β)∧(α∧¬β)) are tautologies. Thus, P(α)=P((α∧β)∨ (α∧¬β))=P(α∧β)+P(α∧¬β). 22
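The fact that A |= B if and only if (A Ʌ ¬B) is unsatisfiable suggests a simple (if exponential) decision procedure: enumerate every possible model and check. A minimal sketch, in which a sentence is modeled as a Python function from a model (a dict of truth values) to a truth value:

```python
from itertools import product

def models(symbols):
    """Generate every possible model over the given proposition symbols."""
    for values in product([True, False], repeat=len(symbols)):
        yield dict(zip(symbols, values))

def satisfiable(sentence, symbols):
    """A sentence is satisfiable if it is true in at least one model."""
    return any(sentence(m) for m in models(symbols))

def entails(a, b, symbols):
    # a |= b  iff  (a Ʌ ¬b) has no model
    return not satisfiable(lambda m: a(m) and not b(m), symbols)

# P Ʌ Q entails P, but P alone does not entail Q:
print(entails(lambda m: m["P"] and m["Q"], lambda m: m["P"], ["P", "Q"]))  # True
print(entails(lambda m: m["P"], lambda m: m["Q"], ["P", "Q"]))             # False
```

This is exactly the model-enumeration approach the text mentions, and its exponential cost over the number of symbols is why propositional satisfiability is NP-complete.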
  • 23. Propositional logic (very simple logic): The syntax of propositional logic defines the allowable sentences. The atomic sentences, the indivisible syntactic elements, consist of a single proposition symbol. Each such symbol stands for a proposition that can be true or false. We will use uppercase names for symbols: P, Q, R, and so on. The names are arbitrary but are often chosen to have some mnemonic value to the reader. There are two proposition symbols with fixed meanings: True is the always-true proposition and False is the always-false proposition. Complex sentences are constructed from simpler sentences using logical connectives. There are five connectives in common use: ¬ (not): a literal is either an atomic sentence (a positive literal) or a negated atomic sentence (a negative literal). Ʌ (and): a sentence whose main connective is Ʌ is called a conjunction; its parts are the conjuncts. V (or): a sentence whose main connective is V is called a disjunction; historically, the V comes from the Latin "vel," which means "or." => (implies): a sentence such as P => Q is called an implication (or conditional); its premise or antecedent is P, and its conclusion or consequent is Q. The implication symbol is sometimes written in other books as -> or ⊃. <=> (if and only if): a sentence such as P <=> Q is a biconditional. Figure: A BNF (Backus-Naur Form) grammar of sentences in propositional logic. One possible model is: m1 = {P1,2 = false, P2,2 = false, P3,1 = true}. The semantics for propositional logic must specify how to compute the truth value of any sentence, given a model. This is done recursively. All sentences are constructed from atomic sentences and the five connectives; therefore, we need to specify how to compute the truth of atomic sentences and how to compute the truth of sentences formed with each of the five connectives. Atomic sentences are easy: True is true in every model and False is false in every model. The truth value of every other proposition symbol must be specified directly in the model.
For example, in the model m1 given earlier, P1,2 is false. For complex sentences, we have rules such as: for any sentence s and any model m, the sentence ¬s is true in m if and only if s is false in m. Fig. truth table for the five logical connectives
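The truth table in the figure can be reproduced directly, using the standard reading of the connectives (in particular, P => Q is false only when P is true and Q is false):

```python
def implies(p, q):
    # P => Q is logically equivalent to (not P) or Q
    return (not p) or q

def iff(p, q):
    # P <=> Q is true when P and Q have the same truth value
    return p == q

# Print the truth table for the five connectives:
print("P      Q      not P  P^Q    PvQ    P=>Q   P<=>Q")
for p in (True, False):
    for q in (True, False):
        row = (p, q, not p, p and q, p or q, implies(p, q), iff(p, q))
        print("  ".join(str(v).ljust(5) for v in row))
```

The equivalence `implies(p, q) == (not p) or q` is the standard material-conditional reading; the "vacuously true" rows (where P is false) fall out of it automatically.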
  • 24. Semantic Networks: Fig. Semantic Network • The idea behind a semantic network is that knowledge is often best understood as a set of concepts that are related to one another. • The meaning of a concept is defined by its relationship to other concepts. • A semantic network consists of a set of nodes that are connected by labeled arcs. The nodes represent concepts and the arcs represent relations between concepts. Common Semantic Relations: There is no standard set of relations for semantic networks, but the following relations are very common: INSTANCE: X is an INSTANCE of Y if X is a specific example of the general concept Y. Example: Elvis is an INSTANCE of Human. ISA: X ISA Y if X is a subset of the more general concept Y. Example: sparrow ISA bird. HASPART: X HASPART Y if the concept Y is a part of the concept X (or this can be any other property). Example: sparrow HASPART tail. Semantic Tree: A semantic tree is a semantic net in which certain links are called branches. Each branch connects two nodes; the head node is called the parent node and the tail node is called the child node. One node has no parent; it is called the root node. Other nodes have exactly one parent. Some nodes have no children; they are called leaf nodes. When two nodes are connected to each other by a chain of two or more branches, one is said to be the ancestor; the other is said to be the descendant. Inheritance: Inheritance is a key concept in semantic networks and can be represented naturally by following ISA links. In general, if concept X has property P, then all concepts that are a subset of X should also have property P. But exceptions are pervasive in the real world!
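ISA inheritance as described above can be sketched as a property lookup that walks ISA links upward, with a direct property on a node overriding the inherited default (handling the exceptions mentioned in the text). The nodes and properties below are invented for illustration:

```python
# ISA links: each node points to its more general parent concept.
isa = {"tweety": "canary", "canary": "bird", "sparrow": "bird"}

# Properties attached directly to nodes.
props = {
    "bird":   {"flies": True, "haspart": "wings"},
    "tweety": {"flies": False},   # exception: overrides the inherited default
}

def get_property(node, prop):
    """Look for the property on the node, then walk ISA links upward."""
    while node is not None:
        if prop in props.get(node, {}):
            return props[node][prop]
        node = isa.get(node)      # follow the ISA link to the parent
    return None                   # property not found anywhere on the chain

print(get_property("sparrow", "haspart"))  # wings (inherited from bird)
print(get_property("tweety", "flies"))     # False (default overridden)
```

Because the search stops at the first node that defines the property, a more specific node always wins over a more general one, which is exactly the "defaults with exceptions" behavior described above.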
  • 25. • • • • In practice, inherited properties are usually treated as default values. If a node has a direct link that contradicts an inherited property, then the default is overridden. Multiple Inheritances: Multiple inheritance allows an object to inherit properties from multiple concepts. Multiple inheritance can sometimes allow an object to inherit conflicting properties. Conflicts are potentially unavoidable, so conflict resolution strategies are needed. Predicate calculus (Predicate logic): In mathematical logic, predicate logic is the generic term for symbolic formal systems like first-order logic, second-order logic, many-sorted logic, or infinitary logic. This formal system is distinguished from other systems in that its formulae contain variables which can be quantified. Two common quantifiers are the existential ∃ ("there exists") and universal ∀ ("for all") quantifiers. The variables could be elements in the universe under discussion, or perhaps relations or functions over that universe. For instance, an existential quantifier over a function symbol would be interpreted as modifier "there is a function". In informal usage, the term "predicate logic" occasionally refers to first-order logic. Predicate calculus symbols may represent either variables, constants, functions or predicates. Constants name specific objects or properties in the domain of discourse. Thus George, tree, tall and blue are examples of well-formed constant symbols. The constants (true) and (false) are sometimes included. Functions denote a mapping of one or more elements in a set (called the domain of the function) into a unique element of another set (the range of the function). Elements of the domain and range are objects in the world of discourse. Every function symbol has an associated arity, indicating the number of elements in the domain mapped onto each element of range. A function expression is a function symbol followed by its arguments. 
The arguments are elements from the domain of the function; the number of arguments is equal to the arity of the function. The arguments are enclosed in parentheses and separated by commas. e.g.: f(X,Y) father(david) price(apple) First-order logic / First-order predicate logic (FOPL): First-order logic is sufficiently expressive to represent a good deal of our commonsense knowledge. It also either subsumes or forms the foundation of many other representation languages and has been studied intensively for many decades. This procedural approach can be contrasted with the declarative nature of propositional logic, in which knowledge and inference are separate, and inference is entirely domain-independent. Propositional logic is a declarative language because its semantics is based on a truth relation between sentences and possible worlds. It also has sufficient expressive power to deal with partial information, using disjunction and negation. Propositional logic has a third property that is desirable in representation languages, namely compositionality.
  • 26. Syntax and Semantics of First-Order Logic: The first-order sentence ∀a (Phil(a) => Schol(a)) asserts that no matter what a represents, if a is a philosopher then a is a scholar. Here the universal quantifier ∀ expresses the idea that the claim in parentheses holds for all choices of a. To show that the claim "If a is a philosopher then a is a scholar" is false, one would show that there is some philosopher who is not a scholar. This counterclaim can be expressed with the existential quantifier: ∃a (Phil(a) Ʌ ¬Schol(a)). Here: ¬ is the negation operator: ¬Schol(a) is true if and only if Schol(a) is false, in other words if and only if a is not a scholar. Ʌ is the conjunction operator: Phil(a) Ʌ ¬Schol(a) asserts that a is a philosopher and also not a scholar. The predicates Phil(a) and Schol(a) take only one parameter each. First-order logic can also express predicates with more than one parameter. For example, "there is someone who can be fooled every time" can be expressed as: ∃x (Person(x) Ʌ ∀y (Time(y) => Canfool(x, y))). Here Person(x) is interpreted to mean x is a person, Time(y) to mean that y is a moment of time, and Canfool(x,y) to mean that (person) x can be fooled at (time) y. For clarity, this statement asserts that there is at least one person who can be fooled at all times, which is stronger than asserting that at all times at least one person exists who can be fooled. The weaker statement would be expressed as: ∀y (Time(y) => ∃x (Person(x) Ʌ Canfool(x, y))). Interpretation: The meaning of a term or formula is a set of elements. The meaning of a sentence is a truth value. The function that maps a formula into a set of elements is called an interpretation. An interpretation maps an intensional description (formula/sentence) into an extensional description (set or truth value). Term: A term is a logical expression that refers to an object. Constant symbols are therefore terms, but it is not always convenient to have a distinct symbol to name every object. For example, in English we might use the expression "John's left leg" rather than giving a name to his leg.
This is what function symbols are for: instead of using a constant symbol, we use LeftLeg(John).
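One common way to realize terms in a program is as nested tuples: a constant is a bare string, and a function application is a tuple of the function symbol followed by its arguments. This is an illustrative sketch, not an implementation from the notes; the helper `arity` is a hypothetical name.

```python
# Terms as nested tuples: constants are strings,
# compound terms are (function_symbol, arg1, arg2, ...).
john = "John"
left_leg_of_john = ("LeftLeg", john)   # the term LeftLeg(John)
father_of_david = ("father", "david")  # the term father(david)

def arity(term):
    """Number of arguments of a compound term (0 for a constant)."""
    return len(term) - 1 if isinstance(term, tuple) else 0

print(arity(left_leg_of_john))  # 1
print(arity(john))              # 0
```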
  • 27. Atomic sentences: Now that we have both terms for referring to objects and predicate symbols for referring to relations, we can put them together to make atomic sentences that state facts. An atomic sentence is formed from a predicate symbol followed by a parenthesized list of terms: Brother(Richard, John). Married(Father(Richard), Mother(John)) states that Richard's father is married to John's mother.

Complex sentences: We can use logical connectives to construct more complex sentences, just as in propositional calculus. The semantics of sentences formed with logical connectives is identical to that in the propositional case. Here are four complex sentences:

¬Brother(LeftLeg(Richard), John)
Brother(Richard, John) Ʌ Brother(John, Richard)
King(Richard) V King(John)
¬King(Richard) => King(John)

Quantifiers: Once we have a logic that allows objects, it is only natural to want to express properties of entire collections of objects, instead of enumerating the objects by name. Quantifiers let us do this. First-order logic contains two standard quantifiers, called universal and existential.

Universal quantification (∀): "All kings are persons" is written in first-order logic as

∀x King(x) => Person(x)

∀ is usually pronounced "For all . . ." (remember that the upside-down A stands for "all"). Thus, the sentence says, "For all x, if x is a king, then x is a person." The symbol x is called a variable. By convention, variables are lowercase letters. A variable is a term all by itself, and as such can also serve as the argument of a function, for example LeftLeg(x). A term with no variables is called a ground term. Intuitively, the sentence ∀x P, where P is any logical expression, says that P is true for every object x.

Existential quantification (∃): Universal quantification makes statements about every object. Similarly, we can make a statement about some object in the universe without naming it, by using an existential quantifier.
To say that King John has a crown on his head, we write

∃x Crown(x) Ʌ OnHead(x, John)

∃x is pronounced "There exists an x such that . . ." or "For some x . . .". Intuitively, the sentence ∃x P says that P is true for at least one object x. More precisely, ∃x P is true in a given model under a given interpretation if P is true in at least one extended interpretation that assigns x to a domain element.

Nested quantifiers: We will often want to express more complex sentences using multiple quantifiers. For example, "Everybody loves somebody" means that for every person, there is someone that person loves:

∀x ∃y Loves(x, y)

On the other hand, to say "There is someone who is loved by everyone," we write:

∃y ∀x Loves(x, y)

Connections between ∀ and ∃: The two quantifiers are actually intimately connected with each other, through negation. Asserting that everyone dislikes parsnips is the same as asserting there does not exist someone who likes them, and vice versa. We can go one step further: "Everyone likes ice cream" means that there is no one who does not like ice cream:

∀x Likes(x, IceCream) ≡ ¬∃x ¬Likes(x, IceCream)
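The difference that quantifier order makes can be seen concretely by evaluating both nested sentences over a small finite relation. This is a sketch with an illustrative `loves` relation, not content from the slides.

```python
# ∀x ∃y Loves(x, y)  versus  ∃y ∀x Loves(x, y) over a finite domain.
people = ["a", "b", "c"]
loves = {("a", "b"), ("b", "c"), ("c", "b")}  # pair (x, y) means x loves y

# "Everybody loves somebody": for every x there is some y that x loves.
everybody_loves_somebody = all(
    any((x, y) in loves for y in people) for x in people)

# "There is someone loved by everyone": one y loved by every x.
someone_loved_by_all = any(
    all((x, y) in loves for x in people) for y in people)

print(everybody_loves_somebody, someone_loved_by_all)  # True False
```

In this model everyone loves someone, yet no single person is loved by all, so swapping the quantifiers changes the truth value.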
  • 28. The De Morgan rules for quantified and unquantified sentences are as follows:

∀x ¬P ≡ ¬∃x P        ¬(P Ʌ Q) ≡ ¬P V ¬Q
¬∀x P ≡ ∃x ¬P        ¬(P V Q) ≡ ¬P Ʌ ¬Q
∀x P ≡ ¬∃x ¬P        P Ʌ Q ≡ ¬(¬P V ¬Q)
∃x P ≡ ¬∀x ¬P        P V Q ≡ ¬(¬P Ʌ ¬Q)

Equality: First-order logic includes one more way to make atomic sentences, other than using a predicate and terms as described earlier. We can use the equality symbol to make statements to the effect that two terms refer to the same object. For example,

Father(John) = Henry

says that the object referred to by Father(John) and the object referred to by Henry are the same. Because an interpretation fixes the referent of any term, determining the truth of an equality sentence is simply a matter of seeing that the referents of the two terms are the same object. The equality symbol can be used to state facts about a given function, as we just did for the Father symbol. It can also be used with negation to insist that two terms are not the same object. To say that Richard has at least two brothers, we would write

∃x, y Brother(x, Richard) Ʌ Brother(y, Richard) Ʌ ¬(x = y)

The sentence ∃x, y Brother(x, Richard) Ʌ Brother(y, Richard) does not have the same meaning, because it is satisfied even when x and y refer to the same brother.

The kinship domain: The domain of family relationships is called kinship. This domain includes facts such as "Sita is the mother of Kush" and "Kush is the father of Hari", and rules such as "One's grandmother is the mother of one's parent." Clearly, the objects in the domain are people. We will have two unary predicates, Male and Female. Kinship relations (parenthood, brotherhood, marriage, and so on) will be represented by binary predicates: Parent, Sibling, Brother, Sister, Child, Daughter, Son, Spouse, Wife, Husband, Grandparent, Grandchild, Cousin, Aunt, and Uncle. We will use functions for Mother and Father, because every person has exactly one of each of these. We can go through each function and predicate, writing down what we know in terms of the other symbols. For example, one's mother is one's female parent:

∀m, c Mother(c) = m ⇔ Female(m) Ʌ Parent(m, c)

One's husband is one's male spouse:

∀w, h Husband(h, w) ⇔ Male(h) Ʌ Spouse(h, w)

Male and female are disjoint categories:

∀x Male(x) ⇔ ¬Female(x)
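The role of the inequality conjunct in "Richard has at least two brothers" can be checked over a finite model. This is an illustrative sketch; the function names and the one-brother model are assumptions, not from the notes.

```python
# Why ¬(x = y) matters: without it, a single brother satisfies the sentence.
brothers_of_richard = ["john"]  # a model in which Richard has one brother

def at_least_two_brothers(brothers):
    # ∃x, y Brother(x, Richard) Ʌ Brother(y, Richard) Ʌ ¬(x = y)
    return any(x != y for x in brothers for y in brothers)

def without_inequality(brothers):
    # ∃x, y Brother(x, Richard) Ʌ Brother(y, Richard): x and y may coincide.
    return any(True for x in brothers for y in brothers)

print(at_least_two_brothers(brothers_of_richard))  # False
print(without_inequality(brothers_of_richard))     # True
```

With only one brother, the version lacking the inequality is still satisfied, which is exactly the distinction the text draws.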
Parent and child are inverse relations:

∀p, c Parent(p, c) ⇔ Child(c, p)

A grandparent is a parent of one's parent:

∀g, c Grandparent(g, c) ⇔ ∃p Parent(g, p) Ʌ Parent(p, c)

Diagnostic rules: Diagnostic rules lead from observed effects to hidden causes. For finding pits, the obvious diagnostic rule says that if a square is breezy, some adjacent square must contain a pit:

∀s Breezy(s) => ∃r Adjacent(r, s) Ʌ Pit(r)

and that if a square is not breezy, no adjacent square contains a pit:

∀s ¬Breezy(s) => ¬∃r Adjacent(r, s) Ʌ Pit(r)

Combining these two, we obtain the bi-conditional sentence

∀s Breezy(s) ⇔ ∃r Adjacent(r, s) Ʌ Pit(r)
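The bi-conditional relating breeziness to adjacent pits can be evaluated on a small grid. This is a sketch in the wumpus-world style; the 3x3 grid and the pit location are illustrative assumptions.

```python
# Breezy(s) ⇔ ∃r Adjacent(r, s) Ʌ Pit(r) on a tiny grid.
pits = {(2, 0)}
squares = [(x, y) for x in range(3) for y in range(3)]

def adjacent(r, s):
    """Squares are adjacent if they differ by one step horizontally or vertically."""
    (rx, ry), (sx, sy) = r, s
    return abs(rx - sx) + abs(ry - sy) == 1

def breezy(s):
    # A square is breezy exactly when some adjacent square contains a pit.
    return any(adjacent(r, s) and r in pits for r in squares)

print(sorted(s for s in squares if breezy(s)))  # [(1, 0), (2, 1)]
```

The only breezy squares are the two neighbours of the pit, matching both the diagnostic and causal readings of the rule.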
  • 29. Causal rules: Causal rules reflect the assumed direction of causality in the world: some hidden property of the world causes certain percepts to be generated. For example, a pit causes all adjacent squares to be breezy:

∀r, s Pit(r) Ʌ Adjacent(r, s) => Breezy(s)

And if all squares adjacent to a given square are pitless, the square will not be breezy:

∀s [∀r Adjacent(r, s) => ¬Pit(r)] => ¬Breezy(s)

With some work, it is possible to show that these two sentences together are logically equivalent to the bi-conditional sentence. The bi-conditional itself can also be thought of as causal, because it states how the truth value of Breezy is generated from the world state.

Horn clauses: In computational logic, a Horn clause is a clause with at most one positive literal. Horn clauses are named after the logician Alfred Horn, who investigated the mathematical properties of similar sentences in the non-clausal form of first-order logic. Horn clauses play a basic role in logic programming and are important for constructive logic. A Horn clause with exactly one positive literal is a definite clause. A definite clause with no negative literals is also called a fact. The following is a propositional example of a definite Horn clause:

¬p V ¬q V u

Such a formula can also be written equivalently in the form of an implication:

(p Ʌ q) => u

In the non-propositional case, all variables in a clause are implicitly universally quantified with scope the entire clause. Thus, for example:

¬human(X) V mortal(X)

stands for:

∀X (human(X) => mortal(X))

Frames: A frame represents an entity as a set of slots (attributes) and associated values. Each slot may have constraints that describe legal values that the slot can take. A frame can represent a specific entity, or a general concept. Frames are implicitly associated with one another because the value of a slot can be another frame.
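Definite clauses support simple forward chaining: repeatedly fire any clause whose body atoms are all known, adding its head, until nothing changes. The following is a minimal propositional sketch (clause encoding and names are illustrative, not from the notes).

```python
# Forward chaining over definite Horn clauses, each written as
# (body_atoms, head). A fact is a clause with an empty body.
clauses = [
    ((), "human(socrates)"),                    # fact: no negative literals
    (("human(socrates)",), "mortal(socrates)"), # human(socrates) => mortal(socrates)
]

def forward_chain(clauses):
    known = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            # Fire a clause when all its body atoms are already derived.
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return known

print(sorted(forward_chain(clauses)))
```

This restriction to at most one positive literal per clause is what makes such efficient, Prolog-style inference possible.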
  • 30. Demons: • One of the main advantages of frames is the ability to include demons to compute slot values. A demon is a function that computes the value of a slot on demand. Features of Frame Representations: • Frames can support values more naturally than semantic nets (e.g. the value 25) • Frames can be easily implemented using object-oriented programming techniques. • Demons allow for arbitrary functions to be embedded in a representation. • But a price is paid in terms of efficiency, generality, and modularity! • Inheritance can be easily controlled. Comparative Issues in Knowledge Representation: • The semantics behind a knowledge representation model depends on the way that it is used (implemented). Notation is irrelevant! Whether a statement is written in logic or as a semantic network is not important -- what matters is whether the knowledge is used in the same manner. • Most knowledge representation models can be made to be functionally equivalent. It is a useful exercise to try converting knowledge in one form to another form. • From a practical perspective, the most important consideration usually is whether the KR model allows the knowledge to be encoded and manipulated in a natural fashion. Expressiveness of Semantic Nets: • Some types of properties are not easily expressed using a semantic network. For example: negation, disjunction, and general non-taxonomic knowledge. • There are specialized ways of dealing with these relationships, for example partitioned semantic networks and procedural attachment. But these approaches are ugly and not commonly used. • Negation can be handled by having complementary predicates (e.g., A and NOT A) and using specialized procedures to check for them. Also very ugly, but easy to do. • If the lack of expressiveness is acceptable, semantic nets have several advantages: inheritance is natural and modular, and semantic nets can be quite efficient. 
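The demon idea above, a function that computes a slot value on demand, can be sketched as a small class. This is an illustrative sketch, not an implementation from the notes; the slot names and the age demon are assumptions.

```python
# A frame as a dict of slots, plus demons: per-slot procedures that
# compute a value only when the slot is asked for and has no stored value.
class Frame:
    def __init__(self, slots=None, demons=None):
        self.slots = dict(slots or {})
        self.demons = dict(demons or {})  # slot name -> function(frame)

    def get(self, slot):
        if slot in self.slots:
            return self.slots[slot]      # stored value wins
        if slot in self.demons:
            return self.demons[slot](self)  # demon fires on demand
        return None

person = Frame(
    slots={"birth_year": 2000, "current_year": 2025},
    demons={"age": lambda f: f.get("current_year") - f.get("birth_year")})

print(person.get("age"))  # 25
```

Because the demon is an arbitrary function, this also illustrates the point above that frames let procedures be embedded in the representation, at some cost in generality and modularity.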
Conceptual dependencies and scripts: Conceptual dependency (CD) is a theory of how to represent the kind of knowledge about events that is usually contained in natural language sentences. The objective is to represent the knowledge in a way that:  Facilitates drawing inferences from the sentences.  Is independent of the language in which the sentences were originally stated. The conceptual dependency representation of a sentence is built not out of primitives corresponding to the words used in the sentence, but rather out of conceptual primitives that can
  • 31. be combined to form the meanings of words in any particular language. Hence, conceptual dependency is implemented in a variety of programs that read and understand natural language text. Semantic nets provide only a structure into which nodes representing information at any level can be placed, whereas conceptual dependency provides both a structure and a specific set of primitives out of which representations of particular pieces of information can be constructed. For example, the sentence "I gave the man a book" is represented in CD as a diagram linking I (the actor) by a double arrow to the past-tense act ATRANS, with an object link (o) to book and a recipient link (R) to the man from I. [Diagram omitted.] The symbols have the following meanings:

 arrows indicate the direction of dependency
 the double arrow indicates the two-way link between actor and action
 p indicates the past tense
 ATRANS is one of the primitive acts used by the theory; it indicates transfer of possession
 o indicates the object case relation
 R indicates the recipient case relation

In CD, representations of actions are built from a set of primitive acts. Some of the typical primitive acts are:

ATRANS transfer of an abstract relationship (example: give)
PTRANS transfer of the physical location of an object (example: go)
MTRANS transfer of mental information (example: tell)
PROPEL application of physical force to an object (example: push)
MOVE movement of a body part by its owner (example: kick)
MBUILD building new information out of old (example: decide)

A second set of CD building blocks is the set of allowable dependencies among the conceptualizations described in a sentence. There are four primitive conceptual categories from which dependency structures can be built:

ACTs actions
PPs objects (picture producers)
AAs modifiers of actions (action aiders)
PAs modifiers of PPs (picture aiders)

Scripts: A script is a structure that prescribes a set of circumstances which could be expected to follow on from one another.
It is similar to a thought sequence or a chain of situations which could be anticipated. It could be considered to consist of a number of slots or frames, but with more specialized roles. Scripts are beneficial because:  Events tend to occur in known runs or patterns.  Causal relationships between events exist.
  • 32.  Entry conditions exist which allow an event to take place.  Prerequisites exist upon events taking place, e.g. when a student progresses through a degree scheme or when a purchaser buys a house.

The components of a script include:

Entry Conditions: these must be satisfied before events in the script can occur.
Results: conditions that will be true after events in the script occur.
Props: slots representing objects involved in events.
Roles: persons involved in the events.
Track: variations on the script; different tracks may share components of the same script.
Scenes: the sequence of events that occur; events are represented in conceptual dependency form.

Advantages of scripts:
 Ability to predict events.
 A single coherent interpretation may be built up from a collection of observations.

Disadvantages:
 Less general than frames.
 May not be suitable to represent all kinds of knowledge.
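The script components listed above can be captured as a plain data structure. This is a sketch of the classic restaurant script; the slot contents and the `can_run` helper are illustrative assumptions, not from the notes.

```python
# A script as a dict whose keys are the standard script components.
restaurant_script = {
    "entry_conditions": ["customer is hungry", "customer has money"],
    "results": ["customer is not hungry", "owner has more money"],
    "props": ["tables", "menu", "food", "money"],
    "roles": ["customer", "waiter", "cook", "cashier"],
    "track": "sit-down restaurant",
    "scenes": ["entering", "ordering", "eating", "paying", "leaving"],
}

def can_run(script, world_facts):
    """A script can be invoked only if all its entry conditions hold."""
    return all(c in world_facts for c in script["entry_conditions"])

print(can_run(restaurant_script,
              {"customer is hungry", "customer has money"}))  # True
print(can_run(restaurant_script, {"customer is hungry"}))     # False
```

Checking entry conditions before running the scenes is what gives scripts their predictive power: once a script is triggered, the expected sequence of events follows.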