The document discusses topics in natural language processing and knowledge representation, including conceptual dependency theory, script structures, the CYC knowledge base, case grammars, and the Semantic Web. It presents each topic through a series of slides by Madhav Mishra, describing the components of scripts, the features of CYC and example queries against its knowledge base, how the Semantic Web uses XML, RDF and ontologies, and an overview of case grammars and their use of functional relationships between nouns and verbs.
24. Script Structures
• A script is a structured representation describing a stereotyped sequence of events in a particular context.
• Scripts are used in natural language understanding systems to organize a knowledge base in terms of the situations that the system should understand. Scripts use a frame-like structure to represent commonly occurring experiences such as going to the movies, eating in a restaurant, shopping in a supermarket, or visiting an ophthalmologist.
• Thus, a script is a structure that prescribes a set of circumstances that can be expected to follow on from one another.
25. • Scripts are beneficial because:
• Events tend to occur in known runs or patterns.
• A causal relationship between events exists.
• An entry condition exists which allows an event to take place.
• Prerequisites exist for events taking place.
26. Components of a script:
• The components of a script include:
• Entry conditions: Basic conditions which must be fulfilled before the events in the script can occur.
• Results: Conditions that will be true after the events in the script have occurred.
• Props: Slots representing objects involved in the events.
• Roles: The actions that the individual participants perform.
• Track: A variation on the script. Different tracks may share components of the same script.
• Scenes: The sequence of events that occur.
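As an illustration that is not part of the original slides, a script with these components can be sketched as a simple data structure. The Python rendering below is hypothetical; the field names mirror the components listed above, and the content anticipates the "going to the bank to withdraw money" example on slide 28.

# Hypothetical sketch of a script as a Python data structure; all names and values are illustrative.
withdraw_money_script = {
    "track": "withdraw cash at the counter",  # variation of the generic bank script
    "entry_conditions": ["customer has an account", "customer is at the bank"],
    "props": ["withdrawal slip", "cash", "counter"],
    "roles": ["customer", "teller"],
    "scenes": [  # the stereotyped sequence of events
        "enter bank",
        "fill in withdrawal slip",
        "hand slip to teller",
        "teller verifies account and hands over cash",
        "leave bank",
    ],
    "results": ["customer has cash", "account balance is reduced"],
}

def entry_conditions_met(script, known_facts):
    # Events in the script can only take place once every entry condition holds.
    return all(condition in known_facts for condition in script["entry_conditions"])

print(entry_conditions_met(withdraw_money_script,
                           {"customer has an account", "customer is at the bank"}))  # prints True

Such a structure lets a system predict unstated events: once the entry conditions hold and the first scenes are observed, the remaining scenes and results can be inferred.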
28. Example: Script for going to the bank to withdraw money.
29. Advantages of Scripts
• Ability to predict events.
• A single coherent interpretation may be built up from a collection of observations.
Disadvantages of Scripts
• Less general than frames.
• May not be suitable to represent all kinds of knowledge.
31. • Cyc has a huge knowledge base which it uses for reasoning. It contains:
• 15,000 predicates
• 300,000 concepts
• 3,200,000 assertions
• All these predicates, concepts and assertions are arranged in numerous ontologies.
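For illustration only (not from the original slides): Cyc's assertions are written in its representation language, CycL, in which constant names carry a #$ prefix. Assertions of roughly the following form make up the knowledge base, the first stating that an individual is an instance of a collection and the second that one collection generalizes to another.

(#$isa #$BillClinton #$UnitedStatesPresident)
(#$genls #$UnitedStatesPresident #$Person)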
32. Cyc: Features
Uncertain Results
• Query: "Who had the motive for the assassination of Rafik Hariri?"
• Since the case is still an unsolved political mystery, there is no way we can ever get a definitive answer.
• In cases like these, Cyc returns the various viewpoints, quoting the sources from which it built its inferences.
• For the above query, it gives two viewpoints:
• "USA and Israel", as quoted from an editorial in Al Jazeera
• "Syria", as quoted from a news report from CNN
33. • It uses Google as the search engine in the background.
• It filters results according to the context of the query.
• For example, if we search for the assassination of Rafik Hariri, it omits results whose time stamp is earlier than the date of the assassination.
34. Qualitative Queries
• Query: "Was Bill Clinton a good President of the United States?"
• In cases like these, Cyc returns the results as pros and cons and leaves it to the user to draw a conclusion.
Queries With No Answer
• Query: "At this instant of time, is Alice inhaling or exhaling?"
• The Cyc system is intelligent enough to recognize queries that can never be answered correctly.
35. • The ultimate goal is to build enough common sense into the Cyc system that it can understand Natural Language.
• Once it understands Natural Language, all the system has to do is crawl through online material, learn new common-sense rules, and evolve.
• This two-step process of building common sense and using machine learning techniques to learn new things will make the Cyc system an infinite source of knowledge.
36. Drawbacks
• There is no single Ontology that works in all
cases.
• Although Cyc is able to simulate common sense,
it cannot distinguish between fact and fiction.
• In Natural Language Processing there is no way
the Cyc system can figure out if a particular word
is used in the normal sense or in the sarcastic
sense.
• Adding knowledge is a very tedious process.
37. Semantic Web
• The development of the Semantic Web is well underway, with the goal that it would be
possible for machines to understand the information on the web rather than
simply display it.
• The major obstacle to this goal is the fact that most information on the web is
designed solely for human consumption. This information should be structured
in a way that machines can understand and process that information.
• The concept of machine-understandable documents does not imply “Artificial
Intelligence”. It only indicates a machine’s ability to solve well-defined problems
by performing well-defined operations on well-defined data.
• The key technological threads that are currently employed in the development
of Semantic Web are: eXtensible Markup Language (XML), Resource Description
Framework (RDF), DAML (DARPA Agent Markup Language).
• Most of the web’s content today is designed for humans to read, not for
computer programs to process meaningfully.
• Computers can
- parse the web pages.
- perform routine processing (here a header, there a link, etc.)
• In general, they have no reliable way to understand and process the semantics.
• The Semantic Web will bring structure to the meaningful content of
web pages, creating an environment where software agents roaming from page to
page can carry out sophisticated tasks for users.
• The Semantic Web is not a separate web but an extension of the current one, in
which information is given well-defined meaning.
39. Knowledge Representation
• For the Semantic Web to function, the computers should have access to:
• Structured Collections of Information
• Meaning of this Information
• Sets of Inference Rules/Logic.
These sets of inference rules can be used to conduct automated
reasoning.
• Technological Threads for developing the Semantic Web:
- XML
- RDF
- Ontologies
40. XML
• XML lets everyone create their own tags.
• These tags can be used by the script programs in sophisticated ways to
perform various tasks, but the script writer has to know what the page
writer uses each tag for.
• In short, XML allows you to add arbitrary structure to the documents but
says nothing about what the structures mean.
• It has no built-in mechanism to convey the meaning of the user’s new tags to
other users.
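A brief sketch of the point above, using Python's standard xml.etree module on a fragment with invented tags: the program recovers the structure, but nothing in the document says what the tags mean.

import xml.etree.ElementTree as ET

# Invented tags: XML gives us the structure, but not the meaning.
doc = """
<invoice>
  <customer>Alice</customer>
  <due>2024-01-31</due>
</invoice>
"""

root = ET.fromstring(doc)
for child in root:
    # We can list tags and their text, but whether <due> is a date,
    # a deadline or something else is not expressed anywhere in the XML.
    print(child.tag, "->", child.text)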
41. RDF
• A scheme for defining information on the web. It provides the technology for
expressing the meaning of terms and concepts in a form that computers can
readily process.
• RDF encodes this information on the XML page in sets of triples. Each
triple is a piece of information on the web about related things.
• Each triple is a combination of Subject, Verb and Object, similar to an
elementary sentence.
• Subjects, Verbs and Objects are each identified by a URI, which enables
anyone to define a new concept or new verb just by defining a URI for it
somewhere on the web.
42. RDF (contd.)
• These triples can be written using XML tags, as shown in the sketch below.
• An RDF document can make assertions that particular things (people, web
pages or whatever) have properties (“is a sister of”, “is the author of”) with
values (another person, another web page, etc.).
• RDF uses a different URI for each specific concept. This solves the problem of
the same term being used for different concepts, e.g. address tags in different XML pages.
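A minimal sketch of such triples using the rdflib library (assuming it is installed); the URIs and property names are invented for the example, and the graph is serialised to RDF/XML, i.e. the triples written with XML tags.

from rdflib import Graph, Namespace, URIRef

# Invented namespace and resources, used only to illustrate subject-verb-object triples.
EX = Namespace("http://example.org/terms#")
alice = URIRef("http://example.org/people#alice")
bob = URIRef("http://example.org/people#bob")
report = URIRef("http://example.org/docs#report")

g = Graph()
g.add((alice, EX.isSisterOf, bob))      # subject, verb (property), object
g.add((alice, EX.isAuthorOf, report))

# Serialise the triples as RDF/XML, i.e. the triples "written using XML tags".
print(g.serialize(format="xml"))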
43. Ontologies
• Ontologies are collections of statements, written in a language such as RDF, that define relations
between concepts and specify logical rules for reasoning about them.
• Computers/agents/services will understand the meaning of semantic data on
a web page by following links to specified ontologies.
• Ontologies can express a large number of relationships among entities
(objects) by assigning properties to classes and allowing subclasses to inherit
such properties.
• An ontology may express a rule such as:
if a city code is associated with a state code, and an address uses that
city code, then that address has the associated state code (a sketch of this rule appears below).
• Ontologies enhance the functioning of the Semantic Web: they improve the accuracy of web
searches and ease the development of programs that can tackle complicated queries.
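A minimal sketch of the city-code/state-code rule above, assuming a toy ontology held in plain Python dictionaries; the codes and addresses are made up.

# Toy "ontology": city codes mapped to state codes (made-up values).
city_to_state = {"MUM": "MH", "PUN": "MH", "BLR": "KA"}

# Addresses that carry only a city code.
addresses = [
    {"street": "1 Marine Drive", "city_code": "MUM"},
    {"street": "5 MG Road", "city_code": "BLR"},
]

# Rule: if a city code is associated with a state code, and an address uses
# that city code, then the address also has the associated state code.
for addr in addresses:
    addr["state_code"] = city_to_state[addr["city_code"]]

print(addresses)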
47. Case Grammars
• Case grammars use the functional relationships between noun phrases and verbs
to uncover the deeper case structure of a sentence.
• Generally, in English, the difference between different surface forms of a
sentence is quite negligible.
• In the late 1960s and early 1970s, Fillmore put forward the idea of different cases of an English
sentence.
• He extended the transformational grammars of Chomsky by focusing more on the
semantic aspects of a sentence.
• In case grammars a sentence is defined as being composed of a proposition P and a
modality constituent M, composed of mood, tense, aspect, negation and so on.
Thus we can represent a sentence as
S → M + P
where P is the set of relationships among verbs and noun phrases, i.e.
P → V + C1 + C2 + … + Cn (C = case), and M is the modality constituent.
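As an informal illustration of a case structure, the sketch below represents the sentence “The boy opened the door with a key” with a modality constituent M and a proposition P holding a few of Fillmore's cases; the labels and dictionary layout are an assumption for the example, not a full case-grammar implementation.

# Modality constituent M and proposition P for one example sentence.
# The case labels follow Fillmore's terminology; the structure is illustrative.
sentence = "The boy opened the door with a key"

case_frame = {
    "M": {"tense": "past", "mood": "declarative", "negation": False},
    "P": {
        "verb": "open",
        "cases": {
            "agentive": "the boy",       # the instigator of the action
            "objective": "the door",     # the entity affected by the action
            "instrumental": "a key",     # the object used to perform the action
        },
    },
}

print(case_frame["P"]["cases"]["agentive"])   # 'the boy'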
51. Components of NLP
• There are two components of NLP as given −
Natural Language Understanding (NLU)
• Understanding involves the following tasks −
• Mapping the given input in natural language into useful representations.
• Analysing different aspects of the language.
Natural Language Generation (NLG)
• It is the process of producing meaningful phrases and sentences in the form of
natural language from some internal representation. It involves:
• Text planning − It includes retrieving the relevant content from the knowledge base.
• Sentence planning − It includes choosing required words, forming meaningful
phrases, setting tone of the sentence.
• Text Realization − It is mapping sentence plan into sentence structure.
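A minimal sketch of the three NLG stages above, using a made-up knowledge base and simple template-style realization; real NLG systems are considerably more sophisticated.

kb = {"patient": "Alice", "temperature_c": 39.2}   # made-up knowledge base

def text_planning(kb):
    """Retrieve the relevant content from the knowledge base."""
    return {"who": kb["patient"], "value": kb["temperature_c"]}

def sentence_planning(content):
    """Choose the required words and form a meaningful phrase plan."""
    return [content["who"], "has", f"a fever of {content['value']} degrees"]

def text_realization(plan):
    """Map the sentence plan onto a surface sentence."""
    return " ".join(plan) + "."

print(text_realization(sentence_planning(text_planning(kb))))
# -> 'Alice has a fever of 39.2 degrees.'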
52. NLP Terminology
• Phonology − It is the study of organizing sounds systematically.
• Morphology − It is the study of the construction of words from primitive
meaningful units.
• Syntax − It refers to arranging words to make a sentence. It also involves
determining the structural role of words in the sentence and in phrases.
• Semantics − It is concerned with the meaning of words and how to
combine words into meaningful phrases and sentences.
• Pragmatics − It deals with using and understanding sentences in different
situations and how the interpretation of the sentence is affected.
• Discourse − It deals with how the immediately preceding sentence can
affect the interpretation of the next sentence.
• World Knowledge − It includes the general knowledge about the world.
53. Sentence Analysis Phases
• Lexical Analysis − It involves identifying and analyzing the
structure of words. Lexicon of a language means the collection
of words and phrases in a language. Lexical analysis is dividing
the whole chunk of text into paragraphs, sentences, and words.
• Syntactic Analysis (Parsing) − It involves analysis of words in
the sentence for grammar and arranging words in a manner
that shows the relationship among the words. A sentence
such as “The school goes to boy” is rejected by the English syntactic
analyzer.
• Semantic Analysis − It draws the exact meaning or the
dictionary meaning from the text. The text is checked for
meaningfulness. It is done by mapping syntactic structures onto
objects in the task domain. The semantic analyzer disregards
sentences such as “hot ice-cream”.
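A rough sketch of the lexical-analysis step only, assuming blank lines separate paragraphs and a full stop ends a sentence (real tokenizers handle far more cases):

text = """The bird pecks the grains. The farmer watches.

It is a sunny day."""

# Naive lexical analysis: paragraphs -> sentences -> words.
paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
for p_no, paragraph in enumerate(paragraphs, start=1):
    sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
    for s_no, sentence in enumerate(sentences, start=1):
        print(f"paragraph {p_no}, sentence {s_no}:", sentence.split())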
54. • Discourse Integration − The meaning of any sentence depends upon
the meaning of the sentence just before it. In addition, it also contributes
to the meaning of the immediately succeeding sentence.
• Pragmatic Analysis − During this phase, what was said is re-interpreted in terms of
what was actually meant. It involves deriving those aspects of language
which require real world knowledge.
55. Grammars And Parsers
• Context-Free Grammar
• It is a grammar that consists of rules with a single symbol on the left-
hand side of each rewrite rule. Let us create a grammar to parse a
sentence −
“The bird pecks the grains”
Articles (DETERMINER(DET)) − a | an | the
Nouns − bird | birds | grain | grains
Noun Phrase (NP) − Article + Noun | Article + Adjective + Noun
= DET N | DET ADJ N
Verbs − pecks | pecking | pecked
Verb Phrase (VP) − NP V | V NP
Adjectives (ADJ) − beautiful | small | chirping
56. • The parse tree breaks down the sentence into structured parts so that
the computer can easily understand and process it. In order for the
parsing algorithm to construct this parse tree, a set of rewrite rules,
which describe what tree structures are legal, need to be constructed.
• These rules say that a certain symbol may be expanded in the tree by a
sequence of other symbols. For example, the first rule says that if there
are two constituents, a Noun Phrase (NP) and a Verb Phrase (VP), then the string
formed by NP followed by VP is a sentence.
sentence are as follows −
• S → NP VP
• NP → DET N | DET ADJ N
• VP → V NP
57. • Lexicon −
• DET → a | the
• ADJ → beautiful | perching
• N → bird | birds | grain | grains
• V → peck | pecks | pecking
• The parse tree can be created as
shown −
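A small sketch of the grammar and lexicon above written in NLTK's CFG notation (assuming the nltk package is installed); the chart parser prints the parse tree for “the bird pecks the grains”, with the sentence lower-cased so the words match the terminals.

import nltk

# Grammar and lexicon from the slides, in NLTK's CFG notation.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> DET N | DET ADJ N
VP -> V NP
DET -> 'a' | 'the'
ADJ -> 'beautiful' | 'perching'
N -> 'bird' | 'birds' | 'grain' | 'grains'
V -> 'peck' | 'pecks' | 'pecking'
""")

parser = nltk.ChartParser(grammar)
sentence = "the bird pecks the grains".split()

for tree in parser.parse(sentence):
    print(tree)   # (S (NP (DET the) (N bird)) (VP (V pecks) (NP (DET the) (N grains))))
    # tree.draw() # uncomment to display the tree graphically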
58. PARSING PROCESS
• Parsing is the term used to describe the process
of automatically building syntactic analysis of a
sentence in terms of a given grammar and
lexicon.
• The resulting syntactic analysis may be used as
input to a process of semantic interpretation.
• Occasionally, parsing is also used to include both
syntactic and semantic analysis.
• The parsing process is done by the parser.
• Parsing performs grouping and labeling of the
parts of a sentence in a way that displays their
relationships to each other.
• The parser is a computer program which accepts
the natural language sentence as input and
generates an output structure suitable for
analysis.
59. Types of Parsing
• The parsing technique can be categorized into two types such as
- Top down Parsing
- Bottom up Parsing
Top down Parsing
Top down parsing starts with the starting symbol and proceeds towards the goal. We can say
it is the process of constructing the parse tree starting at the root and proceeding towards the
leaves.
It is a strategy of analyzing unknown data relationships by hypothesizing general parse tree
structures and then considering whether the known fundamental structures are compatible
with the hypothesis.
In top down parsing words of the sentence are replaced by their categories like verb phrase
(VP), Noun phrase (NP), Preposition phrase (PP), etc.
Let us consider some examples to illustrate top down parsing. We will consider both the
symbolic representation and the graphical representation. We will take the words of the
sentence and arrive at the complete sentence. For parsing we will consider the previously
defined symbols like PP, NP, VP, ART, N, V and so on. Examples of top down parsing are LL (Left-to-
right, leftmost derivation) parsing, recursive descent parsing, etc.
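A brief sketch of top down parsing with NLTK's recursive descent parser on a cut-down version of the earlier toy grammar (assuming nltk is installed); it starts from S and expands categories downward until the hypothesised leaves match the input words.

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> DET N
VP -> V NP
DET -> 'the'
N -> 'bird' | 'grains'
V -> 'pecks'
""")

# Top-down: start from S, expand NP VP, and keep expanding categories until
# the hypothesised leaves match the words of the input sentence.
rd_parser = nltk.RecursiveDescentParser(grammar)
for tree in rd_parser.parse("the bird pecks the grains".split()):
    print(tree)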
61. Bottom up Parsing
• In this parsing technique the process begins with the sentence, and
the words of the sentence are replaced by their relevant symbols.
• It is also called shift-reduce parsing.
• In bottom up parsing the construction of parse tree starts at the
leaves and proceeds towards the root.
• Bottom up parsing is a strategy for analyzing unknown data
relationships that attempts to identify the most fundamental units
first and then to infer higher order structures for them.
• This process occurs in the analysis of both natural languages and
computer languages.
• It is common for bottom up parsers to take the form of general
parsing engines that can either parse or generate a parser for a
specific programming language, given a specification of its grammar.
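For comparison, a brief sketch of bottom up (shift-reduce) parsing on the same cut-down toy grammar with NLTK's ShiftReduceParser (assuming nltk is installed); it shifts words onto a stack and reduces them to NP, VP and finally S. Note that a plain shift-reduce parser is not guaranteed to find a parse for every grammar.

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> DET N
VP -> V NP
DET -> 'the'
N -> 'bird' | 'grains'
V -> 'pecks'
""")

# Bottom-up: shift words, reduce 'the bird' to NP, 'pecks the grains' to VP,
# and finally reduce NP VP to S.
sr_parser = nltk.ShiftReduceParser(grammar)
for tree in sr_parser.parse("the bird pecks the grains".split()):
    print(tree)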
63. Semantic Analysis
• Semantic Analysis is the process of drawing meaning from text.
• It allows computers to understand and interpret sentences, paragraphs, or
whole documents, by analysing their grammatical structure, and identifying
relationships between individual words in a particular context.
• It’s an essential sub-task of Natural Language Processing (NLP) and the
driving force behind machine learning tools like chatbots, search engines,
and text analysis.
• Semantic analysis-driven tools can help companies automatically extract
meaningful information from unstructured data, such as emails, support
tickets, and customer feedback.
64. How Semantic Analysis Works
• Lexical semantics plays an important role in semantic analysis, allowing
machines to understand relationships between lexical items (words, phrasal
verbs, etc.):
• Hyponyms: specific lexical items of a generic lexical item (hypernym) e.g.
orange is a hyponym of fruit (hypernym).
• Meronymy: a logical arrangement of text and words that denotes a
constituent part of, or member of, something, e.g., a segment of an orange
• Polysemy: a relationship in which the meanings of a word or phrase, although
slightly different, share a common core meaning, e.g., “I read a paper”
and “I wrote a paper”
• Synonyms: words that have the same sense or nearly the same meaning as
another, e.g., happy, content, ecstatic, overjoyed
• Antonyms: words that have close to opposite meanings e.g., happy, sad
• Homonyms: two words that sound the same and are spelled alike but
have different meanings, e.g., orange (color), orange (fruit)
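A short sketch of these lexical relations using WordNet through NLTK (assuming nltk and its wordnet corpus are installed, e.g. via nltk.download('wordnet')); the words mirror the examples above.

from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

# Homonymy / polysemy: one spelling, several senses (colour, fruit, ...).
for synset in wn.synsets('orange'):
    print(synset.name(), '-', synset.definition())

# Hypernyms of the fruit sense of 'orange' and a few hyponyms of 'fruit'.
print('hypernyms:', wn.synset('orange.n.01').hypernyms())
print('hyponyms :', wn.synset('fruit.n.01').hyponyms()[:5])

# Part-whole (meronymy) relations for 'tree'.
print('meronyms :', wn.synset('tree.n.01').part_meronyms())

# Synonyms and antonyms of 'happy'.
happy = wn.synset('happy.a.01')
print('synonyms :', [lemma.name() for lemma in happy.lemmas()])
print('antonyms :', [ant.name() for lemma in happy.lemmas() for ant in lemma.antonyms()])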
65. • Semantic analysis also takes into account signs and symbols (semiotics)
and collocations (words that often go together).
• Automated semantic analysis works with the help of machine learning
algorithms.
• By feeding semantically enhanced machine learning algorithms with
samples of text, you can train machines to make accurate predictions
based on past observations.
• There are various sub-tasks involved in a semantic-based approach for
machine learning, including word sense disambiguation and relationship
extraction:
Word Sense Disambiguation & Relationship Extraction
66. Word Sense Disambiguation:
• The automated process of identifying in which sense a word is used,
according to its context.
• Natural language is ambiguous and polysemic; sometimes, the same
word can have different meanings depending on how it’s used.
• The word “orange,” for example, can refer to a color, a fruit, or even a
city in Florida!
• The same happens with the word “date,” which can mean either a
particular day of the month, a fruit, or a meeting.
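A quick sketch of word sense disambiguation with NLTK's implementation of the Lesk algorithm (assuming nltk and the wordnet corpus are installed); it picks the WordNet sense of “date” whose dictionary gloss best overlaps the context. Lesk is a simple gloss-overlap heuristic, so the chosen sense can be unintuitive for short contexts.

from nltk.wsd import lesk               # simplified Lesk algorithm shipped with NLTK
                                        # requires: nltk.download('wordnet')

context = "I bought sweet dates and figs at the market".split()
sense = lesk(context, 'date', pos='n')  # choose a noun sense of 'date' for this context
print(sense, '-', sense.definition() if sense else 'no sense found')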
67. • Relationship Extraction
• This task consists of detecting the semantic relationships present in a
text. Relationships usually involve two or more entities (which can be
names of people, places, company names, etc.). These entities are
connected through a semantic category, such as “works at,” “lives in,”
“is the CEO of,” “headquartered at.”
• For example, the phrase “Steve Jobs is one of the founders of Apple,
which is headquartered in California” contains two different
relationships: (Steve Jobs, founder of, Apple) and (Apple, headquartered in, California).
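A minimal, pattern-based sketch of relationship extraction for the example sentence above; real systems use trained models rather than hand-written regular expressions, so this is illustrative only.

import re

text = ("Steve Jobs is one of the founders of Apple, "
        "which is headquartered in California")

relations = []

# Pattern 1: "<Person> is one of the founders of <Org>"
m = re.search(r"(.+?) is one of the founders of (\w+)", text)
if m:
    relations.append((m.group(1), "founder of", m.group(2)))

# Pattern 2: "<Org>, which is headquartered in <Place>"
m = re.search(r"(\w+), which is headquartered in (\w+)", text)
if m:
    relations.append((m.group(1), "headquartered in", m.group(2)))

print(relations)
# [('Steve Jobs', 'founder of', 'Apple'), ('Apple', 'headquartered in', 'California')]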
77. Dictionary
• Also known as the UNL Dictionary.
• It stores concepts, represented by the words of a language.
• It stores universal words for identifying concepts, word headings that can express concepts, and
information on their syntactic behaviour.
• Each entry consists of a correspondence between a concept and a word along with information
concerning syntactic properties.
• The Grammar for defining words of the language in the dictionary is shown below
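Purely as an illustration of the kind of information such a dictionary entry carries (concept, head word, syntactic attributes), a sketch with invented field names and notation is given below; it is not the actual UNL dictionary-definition grammar.

# Invented structure, for illustration only: one entry linking a concept
# (a "universal word") to a language word plus syntactic information.
entry = {
    "headword": "book",                       # word heading in the natural language
    "universal_word": "book(icl>document)",   # concept identifier (illustrative notation)
    "syntax": {"pos": "noun", "number": "singular"},
}

print(entry["headword"], "expresses the concept", entry["universal_word"])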