2. Definition
• Learning covers a broad range of processes
• To gain
  • knowledge of,
  • or understanding of,
  • or skill in something,
• by study, instruction, or experience.
3. Learning
• Learning is essential for unknown environments,
  • i.e., when the designer lacks omniscience.
• Learning is useful as a system construction method,
  • i.e., expose the agent to reality rather than trying to write it all down.
• Learning modifies the agent's decision mechanisms to improve performance.
An agent is learning if it improves its performance on future tasks after making observations about the world.
4. Why Learning?
• Three main reasons:
  • First, the designers cannot anticipate all possible situations that the agent might find itself in.
  • Second, the designers cannot anticipate all changes over time.
  • Third, sometimes human programmers have no idea how to program a solution themselves.
5. Forms of Learning
ā¢ Any component of an agent can be improved by learning
from data. The improvements, and the techniques used to
make them, depend on four major factors:
ā¢ Which component is to be improved
ā¢ What prior knowledge the agent already has.
ā¢ What representation is used for the data and the component.
ā¢ What feedback is available to learn from.
6. Components to be learned
• The components of these agents include:
  • A direct mapping from conditions on the current state to actions.
  • A means to infer relevant properties of the world from the percept sequence.
  • Information about the way the world evolves and about the results of possible actions the agent can take.
  • Utility information indicating the desirability of world states.
  • Action-value information indicating the desirability of actions.
  • Goals that describe classes of states whose achievement maximizes the agent's utility.
7. Representation and prior knowledge
• We have seen several examples of representations for agent components: propositional and first-order logical sentences for the components in a logical agent.
8. Feedback to learn from
• There are three types of feedback that determine the three main types of learning:
• Supervised learning: correct answers are given for each example; the agent observes some example input-output pairs and learns a function that maps from input to output.
• Unsupervised learning: correct answers are not given; the agent learns patterns in the input even though no explicit feedback is supplied. Example: clustering.
• Reinforcement learning: occasional positive and/or negative rewards; the agent learns from a series of reinforcements (rewards or punishments).
9. Example

Performance Element  | Component        | Representation           | Feedback
---------------------|------------------|--------------------------|---------------
Alpha-Beta Pruning   | Eval. Function   | Weighted Linear Function | Win/Loss
Logical Agent        | Transition Model | Successor-State Axioms   | Outcome
Utility-Based Agent  | Transition Model | Dynamic Bayes Network    | Outcome
Simple Reflex Agent  | Percept-action   | Neural Network           | Correct action
11. Summary of Learning
• Any situation in which both the inputs and outputs of a component can be perceived is called supervised learning.
• In learning the condition-action component, the agent receives some evaluation of its action but is not told the correct action. This is called reinforcement learning.
• Learning when there is no hint at all about the correct outputs is called unsupervised learning.
12. Inductive Learning
This involves the process of learning by example -- where a system tries to
induce a general rule from a set of observed instances.
13. Continue…
• This involves classification -- assigning, to a particular input, the name of a class to which it belongs. Classification is important to many problem-solving tasks.
• A learning system has to be capable of evolving its own class descriptions:
  • Initial class definitions may not be adequate.
  • The world may not be well understood or may be rapidly changing.
• The task of constructing class definitions is called induction or concept learning.
14. Continue…
• Simplest form: learn a function from examples. f is the target function; an example is a pair (x, f(x)).
• Problem: find a hypothesis h such that h ≈ f, given a training set of examples.
• This is a highly simplified model of real learning:
  • Ignores prior knowledge
  • Assumes a deterministic, observable environment
  • Assumes examples are given
  • Assumes that the agent wants to learn f (why?)
15. Continue…
• Construct/adjust h to agree with f on the training set
• (h is consistent if it agrees with f on all examples)
• E.g., curve fitting:
Ockham's razor: prefer the simplest hypothesis consistent with the data.
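The curve-fitting idea can be sketched in a few lines: given examples (x, f(x)) of a hypothetical linear target f, fit a line h by least squares and check that h is consistent with every example. The target function and data here are invented for illustration.

```python
# Fit hypothesis h to examples (x, f(x)) of a hidden linear target f.
# Assumption: f is linear, so a least-squares line recovers it exactly.

def fit_line(points):
    """Least-squares line y = a*x + b through the given (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

examples = [(x, 2 * x + 1) for x in range(5)]   # hypothetical target f(x) = 2x + 1
a, b = fit_line(examples)
h = lambda x: a * x + b
consistent = all(abs(h(x) - y) < 1e-9 for x, y in examples)
print(a, b, consistent)   # 2.0 1.0 True
```

Here h is consistent: it agrees with f on all training examples, and by Ockham's razor the line is preferred over any higher-degree polynomial that also fits.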
17. Continue…
• We first describe the representation (the hypothesis space) and then show how to learn a good hypothesis.
18. Representation
• A decision tree takes as input an object described by a set of properties and outputs a Boolean value (a yes/no decision).
• Each internal node in the tree corresponds to a test of one of the properties. Branches are labelled with the possible values of the test.
• Aim: learn the goal concept (goal predicate) from examples.
• Learning element: the algorithm that builds up the decision tree.
• Performance element: the decision procedure given by the tree.
20. Continue…
• A Boolean decision tree is logically equivalent to the assertion that the goal attribute is true if and only if the input attributes satisfy one of the paths leading to a leaf with value true. Writing this out in propositional logic, we have
  Goal ⇔ (Path1 ∨ Path2 ∨ ...)
• where each Path is a conjunction of attribute-value tests required to follow that path. Thus, the whole expression is in disjunctive normal form, which means that any function in propositional logic can be expressed as a decision tree. For example:
  Path = (Patrons = Full ∧ WaitEstimate = 0-10)
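The path-based reading of a decision tree can be made concrete. The sketch below builds a made-up fragment of the restaurant tree, extracts the conjunctive paths that lead to true leaves (the DNF), and checks that evaluating the tree and testing the DNF give the same answer on an example.

```python
# A tiny Boolean decision tree over the restaurant attributes, and the
# equivalent DNF assertion Goal <=> (Path1 v Path2 v ...). The tree is
# an illustrative fragment, not the full tree from the slides.

tree = ("Patrons",
        {"None": False,
         "Some": True,
         "Full": ("WaitEstimate", {"0-10": True, "10-30": False,
                                   "30-60": False, ">60": False})})

def decide(tree, example):
    """Evaluate the tree on an example (dict of attribute -> value)."""
    if isinstance(tree, bool):
        return tree
    attr, branches = tree
    return decide(branches[example[attr]], example)

def true_paths(tree, path=()):
    """Collect the conjunctions of tests leading to a 'true' leaf."""
    if isinstance(tree, bool):
        return [path] if tree else []
    attr, branches = tree
    return [p for v, sub in branches.items()
            for p in true_paths(sub, path + ((attr, v),))]

example = {"Patrons": "Full", "WaitEstimate": "0-10"}
dnf = true_paths(tree)
# Goal holds iff the example satisfies some path of the DNF.
goal = any(all(example[a] == v for a, v in path) for path in dnf)
print(decide(tree, example), goal, len(dnf))   # True True 2
```

The two true paths here are (Patrons = Some) and (Patrons = Full ∧ WaitEstimate = 0-10), matching the Path example above.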
21. Example
• Problem: whether to wait for a table at a restaurant. A decision tree decides whether to wait or not in a given situation.
• Attributes:
  • Alternate: whether there is an alternative restaurant nearby
  • Bar: whether there is a bar area to wait in
  • Fri/Sat: true on Fridays and Saturdays
  • Hungry: whether we are hungry
  • Patrons: how many people are in the restaurant (None, Some, or Full)
  • Price: price range (£, ££, £££)
22. Continue…
• Raining: whether it is raining outside
• Reservation: whether we made a reservation
• Type: kind of restaurant (French, Italian, Thai, or Burger)
• WaitEstimate: estimated wait in minutes (<10, 10-30, 30-60, >60)
27. How to pick nodes?
• For a training set containing p positive examples and n negative examples, the entropy is:
  H(p/(p+n), n/(p+n)) = −(p/(p+n)) log2(p/(p+n)) − (n/(p+n)) log2(n/(p+n))
• A chosen attribute A, with K distinct values, divides the training set E into subsets E1, …, EK.
• The expected entropy (EH) remaining after testing attribute A (with branches i = 1, 2, …, K) is:
  EH(A) = Σ (i=1..K) (p_i + n_i)/(p + n) · H(p_i/(p_i + n_i), n_i/(p_i + n_i))
28. Continue…
• Information gain (I), or reduction in entropy, for attribute A is:
  I(A) = H(p/(p+n), n/(p+n)) − EH(A)
• Example:
  I(Patrons) = H(6/12, 6/12) − [2/12 · H(0/2, 2/2) + 4/12 · H(4/4, 0/4) + 6/12 · H(2/6, 4/6)] ≈ 0.541 bits
29. How to select the next node?
• Given Patrons as the root node, the next attribute chosen is Hungry?, with
  IG(Hungry?) = H(1/3, 2/3) − (2/3 · 1 + 1/3 · 0) ≈ 0.252
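The entropy and gain computations above can be sketched in a few lines; the split counts below are the (p_i, n_i) pairs from the restaurant example (Patrons over all 12 examples, then Hungry within the 6 Patrons=Full examples).

```python
from math import log2

def H(q):
    """Entropy (bits) of a Boolean variable that is true with probability q."""
    return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)

def gain(p, n, splits):
    """I(A) = H(p/(p+n)) - EH(A); splits lists (p_i, n_i) per branch of A."""
    eh = sum((pi + ni) / (p + n) * H(pi / (pi + ni)) for pi, ni in splits)
    return H(p / (p + n)) - eh

# Patrons splits the 12 examples into None (0+, 2-), Some (4+, 0-),
# Full (2+, 4-):
print(round(gain(6, 6, [(0, 2), (4, 0), (2, 4)]), 3))   # 0.541
# With Patrons = Full fixed (2+, 4-), Hungry splits into (2, 2) and (0, 2):
print(round(gain(2, 4, [(2, 2), (0, 2)]), 3))           # 0.252
```

Both values match the worked numbers on the slides.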
33. Continue…
• Ensemble learning is a machine learning paradigm in which multiple learners are trained to solve the same problem. In contrast to ordinary machine learning approaches, which try to learn one hypothesis from the training data, ensemble methods construct a set of hypotheses and combine them.
34. Continue…
• Ensemble learning helps improve machine learning results by combining several models.
• This approach allows the production of better predictive performance compared to a single model.
• Ensemble methods are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease variance (bagging), decrease bias (boosting), or improve predictions (stacking).
36. Boosting
• Boosting refers to a family of algorithms that are able to convert weak learners into strong learners.
• The main principle of boosting is to fit a sequence of weak learners (models that are only slightly better than random guessing, such as small decision trees) to weighted versions of the data.
• More weight is given to examples that were misclassified by earlier rounds.
• The predictions are then combined through a weighted majority vote (classification) or a weighted sum (regression) to produce the final prediction.
37. Continue…
• The principal difference between boosting and committee methods such as bagging is that the base learners are trained in sequence on a weighted version of the data.
• The algorithm below describes the most widely used form of boosting, called AdaBoost, which stands for adaptive boosting.
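Since the AdaBoost listing itself is not reproduced in these notes, here is a minimal self-contained sketch of the scheme described above: threshold "stumps" as weak learners, misclassified examples up-weighted each round, and a weighted vote as the final prediction. The 1-D dataset is invented for illustration.

```python
import math

# AdaBoost sketch on 1-D data. Weak learners are threshold stumps
# sign(s) if x > t else -sign(s); no single stump fits this labelling.
X = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 1, -1, -1, -1, 1, 1, 1]

def stumps():
    for t in range(9):
        for s in (1, -1):
            yield lambda x, t=t, s=s: s if x > t else -s

def adaboost(X, y, rounds=5):
    w = [1 / len(X)] * len(X)          # uniform example weights
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        h, err = min(((h, sum(wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi))
                      for h in stumps()), key=lambda p: p[1])
        err = max(err, 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        # Up-weight misclassified examples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * h(xi)) for wi, xi, yi in zip(w, X, y)]
        z = sum(w)
        w = [wi / z for wi in w]
        ensemble.append((alpha, h))
    # Final prediction: weighted vote of the weak learners.
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) > 0 else -1

clf = adaboost(X, y)
print(all(clf(x) == t for x, t in zip(X, y)))   # True
```

After a few rounds the weighted vote classifies every training example correctly, even though each individual stump is only slightly better than chance.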
42. Continue…
• Natural Language Processing (NLP) refers to the AI method of communicating with an intelligent system using a natural language such as English.
• Processing of natural language is required when you want an intelligent system such as a robot to perform as per your instructions, when you want to hear a decision from a dialogue-based clinical expert system, etc.
• The field of NLP involves making computers perform useful tasks with the natural languages humans use. The input and output of an NLP system can be:
  • Speech
  • Written text
43. Continue…
• To process written text, we need:
  • lexical, syntactic, and semantic knowledge about the language
  • discourse information and real-world knowledge
• To process spoken language, we need everything required to process written text, plus the challenges of speech recognition and speech synthesis.
44. Components of NLP
• Natural Language Understanding (NLU)
• Understanding involves the following tasks:
  • Mapping the given input in natural language into useful representations.
  • Analyzing different aspects of the language.
45. Continue...
• Natural Language Generation (NLG)
  • It is the process of producing meaningful phrases and sentences in the form of natural language from some internal representation.
  • It involves:
    • Text planning: retrieving the relevant content from the knowledge base.
    • Sentence planning: choosing the required words, forming meaningful phrases, and setting the tone of the sentence.
    • Text realization: mapping the sentence plan into sentence structure.
NLU is harder than NLG.
46. Difficulties in NLU
• Natural language has an extremely rich form and structure.
• It is very ambiguous. There can be different levels of ambiguity:
  • Lexical ambiguity: at a very primitive level, such as the word level.
  • For example, should the word "board" be treated as a noun or a verb?
  • Syntax-level ambiguity: a sentence can be parsed in different ways.
47. Continue…
• For example, "He lifted the beetle with red cap." Did he use a cap to lift the beetle, or did he lift a beetle that had a red cap?
• Referential ambiguity: referring to something using pronouns. For example, Rima went to Gauri. She said, "I am tired." Exactly who is tired?
• One input can have several meanings, and many inputs can mean the same thing.
48. NLP Terminology
• Phonology: the study of organizing sounds systematically.
• Morphology: the study of the construction of words from primitive meaningful units.
• Morpheme: a primitive unit of meaning in a language.
• Syntax: arranging words to make a sentence. It also involves determining the structural role of words in the sentence and in phrases.
49. Continue…
• Semantics: concerned with the meaning of words and how to combine words into meaningful phrases and sentences.
• Pragmatics: deals with using and understanding sentences in different situations and how the interpretation of a sentence is affected.
• Discourse: deals with how the immediately preceding sentence can affect the interpretation of the next sentence.
• World knowledge: the general knowledge about the world.
50. Steps in NLP
• There are five general steps:
  • Lexical Analysis
  • Syntactic Analysis (Parsing)
  • Semantic Analysis
  • Discourse Integration
  • Pragmatic Analysis
51. Continue…
• Lexical Analysis: identifying and analyzing the structure of words. The lexicon of a language is the collection of words and phrases in that language. Lexical analysis divides the whole chunk of text into paragraphs, sentences, and words.
• Syntactic Analysis (Parsing): analyzing the words in a sentence for grammar and arranging them in a manner that shows the relationships among the words. A sentence such as "The school goes to boy" is rejected by an English syntactic analyzer.
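The lexical-analysis step above can be sketched with plain regular expressions: split the text into sentences, then each sentence into words. Real toolkits handle far more (abbreviations, clitics, Unicode); this shows only the idea.

```python
import re

def lexical_analysis(text):
    """Divide raw text into sentences, then each sentence into words."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return [re.findall(r"[A-Za-z]+", s) for s in sentences]

tokens = lexical_analysis("He lifted the beetle with a red cap. Exactly who is tired?")
print(tokens)
# [['He', 'lifted', 'the', 'beetle', 'with', 'a', 'red', 'cap'],
#  ['Exactly', 'who', 'is', 'tired']]
```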
52. Continue…
• Semantic Analysis: draws the exact or dictionary meaning from the text. The text is checked for meaningfulness. This is done by mapping syntactic structures onto objects in the task domain. The semantic analyzer disregards a phrase such as "hot ice-cream".
• Discourse Integration: the meaning of any sentence depends upon the meaning of the sentence just before it. In addition, it also influences the meaning of the immediately succeeding sentence.
• Pragmatic Analysis: during this step, what was said is re-interpreted as what was actually meant. It involves deriving those aspects of language which require real-world knowledge.
55. Introduction
• A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance on tasks in T, as measured by P, improves with experience E.
56. (Diagram) The learner is fed experience or data, a problem or task, and background knowledge (which helps the system); it produces models that a reasoner uses to deliver a solution and its corresponding performance measure.
58. Creating a Learner
• Choose the training experience (training data).
• Choose the target function (how to represent the model that is to be learned).
• Choose how to represent the target function.
• Choose a learning algorithm.
59. Different Types of Learning
1. Supervised Learning
2. Unsupervised Learning
3. Semi-Supervised Learning
4. Reinforcement Learning
60. Supervised Learning
• This kind of learning is possible when the inputs and the outputs are clearly identified, and algorithms are trained using labeled examples.
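A tiny sketch of learning from labeled examples, using a 1-nearest-neighbour classifier; the points and labels below are invented for illustration.

```python
# Labeled training examples: (input, output) pairs with known answers.
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
         ((4.0, 4.2), "B"), ((4.1, 3.9), "B")]

def classify(x):
    """Predict the label of the training example closest to x."""
    def dist2(p):
        return sum((a - b) ** 2 for a, b in zip(p, x))
    return min(train, key=lambda ex: dist2(ex[0]))[1]

print(classify((1.1, 0.9)), classify((4.2, 4.0)))   # A B
```

New inputs are mapped to outputs by generalizing from the labeled pairs, which is exactly the supervised setting described above.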
61. Unsupervised Learning
• Unlike supervised learning, unsupervised learning is used with data sets that have no historical labels. An unsupervised learning algorithm explores the data to find structure. This kind of learning works best for transactional data; for instance, it helps in identifying customer segments and clusters with certain attributes.
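The customer-segmentation idea can be sketched with a plain 1-D k-means: group unlabeled "transaction amounts" into k clusters with no labels involved. The data, k, and the quantile initialisation are illustrative choices.

```python
def kmeans(data, k, iters=10):
    """Plain 1-D k-means with deterministic quantile initialisation."""
    pts = sorted(data)
    centers = [pts[i * (len(pts) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        # Assign each point to its nearest centre.
        clusters = [[] for _ in range(k)]
        for x in data:
            clusters[min(range(k), key=lambda i: abs(x - centers[i]))].append(x)
        # Move each centre to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

data = [1.0, 1.2, 0.8, 9.8, 10.1, 10.4, 5.0, 5.2, 4.9]
print(kmeans(data, 3))   # three centres near 1, 5, and 10
```

The algorithm discovers the three segments purely from the structure of the input, with no correct answers supplied.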
62. Semi-Supervised Learning
• As the name suggests, semi-supervised learning is a bit of both supervised and unsupervised learning and uses both labeled and unlabeled data for training. In a typical scenario, the algorithm would use a small amount of labeled data with a large amount of unlabeled data.
63. Reinforcement Learning
• This is a bit similar to the traditional type of data analysis; the algorithm discovers through trial and error which actions yield the greatest rewards. Three major components can be identified in reinforcement learning: the agent, the environment, and the actions. The agent is the learner or decision-maker, the environment includes everything that the agent interacts with, and the actions are what the agent can do.
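The three components can be shown in a minimal Q-learning sketch: the agent maintains a table of action values, the environment is a 1-D corridor with a reward at the right end, and the actions move left or right. All parameters here are illustrative.

```python
import random

# Environment: states 0..4 in a corridor; entering state 4 yields reward 1.
# Agent: a Q-table updated from trial-and-error experience.
# Actions: move left (-1) or right (+1).
N_STATES, ACTIONS = 5, (-1, +1)
alpha, gamma = 0.5, 0.9
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
rng = random.Random(0)

for _ in range(200):                         # episodes of trial and error
    s = 0
    while s != N_STATES - 1:
        a = rng.choice(ACTIONS)              # explore by acting at random
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: learn the value of the best follow-up action.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# The learned greedy policy should be "move right" in every state.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)   # [1, 1, 1, 1]
```

Even though the behaviour is random, the update rule learns which action yields the greater long-run reward in each state.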
66. Perception
• Although perception appears to be an effortless activity for humans, it requires a significant amount of sophisticated computation.
• The goal of vision is to extract the information needed for tasks such as manipulation, navigation, and object recognition.
67. Why image processing?
• It is motivated by three major applications:
  • Improvement of pictorial information for human perception.
  • Image processing for autonomous machine applications.
  • Efficient storage and transmission.
68. Basic Steps of Image Processing
• Image acquisition: an imaging sensor and the capability to digitize the signal produced by the sensor.
• Preprocessing: enhances image quality; filtering, contrast enhancement, etc.
• Segmentation: partitions an image into its constituent parts or objects.
• Description / feature selection: extracts descriptions of image objects suitable for further computer processing.
69. Continue…
• Recognition & interpretation: assigning a label to an object based on the information provided by its descriptors. Interpretation assigns meaning to a set of labeled objects.
• Knowledge base: the KB supports efficient processing as well as inter-module cooperation.
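The segmentation and description steps above can be sketched on a synthetic grayscale image: threshold the pixels into object and background, then extract simple features (area, bounding box). The image and threshold are made up; real systems would use a library such as OpenCV.

```python
# A synthetic 5x5 grayscale "image" (pixel values 0-255) containing one
# bright object on a dark background.
image = [
    [ 10,  12,  11,  10,  10],
    [ 10, 200, 210,  12,  10],
    [ 11, 205, 220,  11,  10],
    [ 10,  12,  10,  10,  10],
    [ 10,  10,  11,  10,  10],
]

def segment(img, threshold=128):
    """Segmentation: partition pixels into object (1) and background (0)."""
    return [[1 if px > threshold else 0 for px in row] for row in img]

def describe(mask):
    """Description: extract simple features (object area and bounding box)."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return {"area": len(coords),
            "bbox": (min(rows), min(cols), max(rows), max(cols))}

mask = segment(image)
print(describe(mask))   # {'area': 4, 'bbox': (1, 1, 2, 2)}
```

A recognition step would then assign a label to the object based on features like these.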