Machine Learning in Artificial Intelligence
Aman Patel
Roll no: A211
Machine Learning: Definition
• Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data.
• Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
• For example, a machine learning system could be trained on email messages to learn to distinguish between spam and non-spam messages. After learning, it can then be used to classify new email messages into spam and non-spam folders.
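
A minimal sketch of the spam example, assuming scikit-learn is available; the email texts and labels are hypothetical placeholders (task T = classifying messages, performance P = accuracy on new messages, experience E = the labelled examples).

# Minimal sketch of the spam example (hypothetical data).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

emails = [
    "Win a free prize now",           # spam
    "Meeting rescheduled to Monday",  # non-spam
    "Cheap loans, apply today",       # spam
    "Lunch tomorrow?",                # non-spam
]
labels = ["spam", "non-spam", "spam", "non-spam"]

# Experience E: train on the labelled messages.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)

# After learning, classify a previously unseen message.
print(model.predict(["Free prize waiting for you"]))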
Why is Machine Learning Important?
• Some tasks cannot be defined well, except by examples (e.g., recognizing people).
• Relationships and correlations can be hidden within large amounts of data. Machine Learning/Data Mining may be able to find these relationships.
• Human designers often produce machines that do not work as well as desired in the environments in which they are used.
Why is Machine Learning Important (Cont’d)?
• The amount of knowledge available about certain tasks might be too large for explicit encoding by humans (e.g., medical diagnosis).
• Environments change over time.
• New knowledge about tasks is constantly being discovered by humans. It may be difficult to continuously re-design systems “by hand”.
Areas of Influence for Machine Learning
• Statistics: How best to use samples drawn from unknown probability distributions to help decide from which distribution some new sample is drawn?
• Brain Models: Non-linear elements with weighted inputs (Artificial Neural Networks) have been suggested as simple models of biological neurons.
• Adaptive Control Theory: How to deal with controlling a process having unknown parameters that must be estimated during operation?
Areas of Influence for Machine Learning (Cont’d)
• Psychology: How to model human performance on various learning tasks?
• Artificial Intelligence: How to write algorithms to acquire the knowledge humans are able to acquire, at least as well as humans?
• Evolutionary Models: How to model certain aspects of biological evolution to improve the performance of computer programs?
Designing a Learning System: An Example
• Problem Description
• Choosing the Training Experience
• Choosing the Target Function
• Choosing a Representation for the Target Function
• Choosing a Function Approximation Algorithm
• Final Design
Problem Description: A Checkers Learning Problem
• Task T: Playing Checkers
• Performance Measure P: Percent of games won against opponents
• Training Experience E: To be selected ==> Games played against itself
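
To connect this problem description with the design steps listed above (target function, representation, function approximation), here is a hedged sketch along the lines of the standard textbook treatment of this example: the target function V(board) is represented as a linear combination of hand-chosen board features, and its weights are fitted with an LMS-style update. The feature encoding and numbers below are hypothetical, not taken from the slides.

# Hypothetical sketch: a linear evaluation function V(board) for the
# checkers example, with an LMS-style weight update. The board is
# assumed to be pre-encoded as a short list of numeric features
# (e.g. piece counts); the exact features are placeholders.

def features(board_features):
    # Prepend a constant term so the first weight acts as a bias.
    return [1.0] + list(board_features)

def V(weights, board_features):
    # Approximate target function: weighted sum of the board features.
    return sum(w * x for w, x in zip(weights, features(board_features)))

def lms_update(weights, board_features, v_train, lr=0.01):
    # Nudge the weights toward the training value v_train for this board.
    error = v_train - V(weights, board_features)
    return [w + lr * error * x for w, x in zip(weights, features(board_features))]

# Usage: weights start at zero and would be refined from self-play games
# (the training experience E), e.g. using the value of a successor
# position as the training value for the current one.
weights = [0.0, 0.0, 0.0, 0.0]
weights = lms_update(weights, [2.0, 1.0, 0.0], v_train=100.0)
print(weights)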
Issues in Machine Learning
• What algorithms are available for learning a concept? How well do they perform?
• How much training data is sufficient to learn a concept with high confidence?
• When is it useful to use prior knowledge?
• Are some training examples more useful than others?
• What are the best tasks for a system to learn?
• What is the best way for a system to represent its knowledge?
Machine Learning Algorithm Types
• Machine learning algorithms can be organized into a taxonomy based on the desired outcome of the algorithm or the type of input available during training.
• Supervised learning algorithms are trained on labelled examples, i.e., input where the desired output is known. The supervised learning algorithm attempts to generalise a function or mapping from inputs to outputs, which can then be used to speculatively generate an output for previously unseen inputs (see the sketch after this list).
• Unsupervised learning algorithms operate on unlabelled examples, i.e., input where the desired output is unknown. Here the objective is to discover structure in the data (e.g. through a cluster analysis), not to generalise a mapping from inputs to outputs.
• Semi-supervised learning combines both labelled and unlabelled examples to generate an appropriate function or classifier.
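
To make the supervised/unsupervised distinction concrete, the sketch below (not part of the original slides) contrasts a labelled fit with a clustering of the same toy data, assuming scikit-learn; the data points are hypothetical.

# Supervised vs. unsupervised learning on tiny hypothetical 2-D data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X = np.array([[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [1.2, 0.9]])

# Supervised: labels (desired outputs) are given, and we learn a mapping.
y = np.array([0, 0, 1, 1])
clf = LogisticRegression().fit(X, y)
print(clf.predict([[1.1, 1.0]]))   # predicted label for an unseen input

# Unsupervised: no labels; we only look for structure (here, 2 clusters).
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)                  # discovered cluster assignments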
Machine Learning Algorithm Types (Cont’d)
• Reinforcement learning is concerned with how intelligent agents ought to act in an environment to maximise some notion of reward. The agent executes actions which cause the observable state of the environment to change. Through a sequence of actions, the agent attempts to gather knowledge about how the environment responds to its actions, and attempts to synthesise a sequence of actions that maximises a cumulative reward (a minimal sketch follows after this list).
• Developmental learning, elaborated for robot learning, generates its own sequences (also called a curriculum) of learning situations to cumulatively acquire repertoires of novel skills through autonomous self-exploration and social interaction with human teachers, using guidance mechanisms such as active learning, maturation, motor synergies, and imitation.
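
A minimal sketch of the reinforcement-learning loop described above: tabular Q-learning on an invented five-state corridor where the only reward is for reaching the rightmost state. The environment, reward, and hyperparameters are hypothetical, chosen purely for illustration.

# Tabular Q-learning on an invented 5-state corridor: states 0..4,
# actions move-left / move-right, and reward 1 only for reaching state 4.
# The agent learns from reward through repeated interaction.
import random

N_STATES = 5
ACTIONS = [-1, +1]                        # 0 = move left, 1 = move right
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.3         # hypothetical hyperparameters

for episode in range(500):
    s = 0
    for _ in range(100):                  # cap the episode length
        # Epsilon-greedy action selection.
        a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda i: Q[s][i])
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s2,a').
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
        if s == N_STATES - 1:
            break

print([max((0, 1), key=lambda i: Q[s][i]) for s in range(N_STATES)])  # greedy action per state (1 = right)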
AdaBoost Algorithm
• AdaBoost, short for Adaptive Boosting, is a machine learning algorithm, formulated by Yoav Freund and Robert Schapire.
• It is a meta-algorithm, and can be used in conjunction with many other learning algorithms to improve their performance.
• AdaBoost is adaptive in the sense that subsequent classifiers built are tweaked in favour of those instances misclassified by previous classifiers.
• AdaBoost is sensitive to noisy data and outliers.
AdaBoost - Adaptive Boosting
• Instead of resampling, AdaBoost uses training-set re-weighting.
• Each training sample has a weight that determines its probability of being selected for a training set.
• AdaBoost is an algorithm for constructing a “strong” classifier as a linear combination of simple “weak” classifiers (see the sketch after this list).
• The final classification is based on a weighted vote of the weak classifiers.
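
As a usage-level illustration (not from the slides), the sketch below assumes scikit-learn, whose AdaBoostClassifier boosts weak learners (by default, depth-1 decision trees, i.e. decision stumps) into a single strong classifier; the dataset is synthetic.

# AdaBoost as a meta-algorithm: boosting many weak learners (by default,
# depth-1 decision trees, i.e. "stumps") into a single strong classifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = AdaBoostClassifier(n_estimators=50, random_state=0)
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))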
AdaBoost Terminology
• h_t(x) … a “weak” or basis classifier (Classifier = Learner = Hypothesis)
• H(x) = sign( Σ_t α_t h_t(x) ) … the “strong” or final classifier
• Weak Classifier: < 50% error over any distribution
• Strong Classifier: thresholded linear combination of the weak classifier outputs
AdaBoost: The Algorithm
• The framework
  • The learner receives examples (x_i, y_i), i = 1 … N, chosen randomly according to some fixed but unknown distribution P on X × Y.
  • The learner finds a hypothesis h that is consistent with most of the samples, i.e. h(x_i) = f(x_i) = y_i for most 1 ≤ i ≤ N.
• The algorithm
  • Input variables:
    P: the distribution from which the training examples are sampled
    D: the distribution (weights) over all the training samples
    WeakLearn: a weak learning algorithm to be boosted
    T: the specified number of iterations
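
A minimal from-scratch sketch of the loop these inputs describe, for binary labels in {-1, +1}: maintain a distribution D over the N samples, call WeakLearn for T rounds, and re-weight the samples that the current weak hypothesis misclassifies. The decision-stump WeakLearn and the toy data are hypothetical.

# Sketch of the AdaBoost loop: maintain a distribution D over the N
# samples, call WeakLearn T times, and re-weight misclassified samples.
import math

def weak_learn(X, y, D):
    # Trivial "WeakLearn": the best single-feature threshold stump under D.
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (+1, -1):
                pred = [sign if x[j] > thr else -sign for x in X]
                err = sum(d for d, p, t in zip(D, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    err, j, thr, sign = best
    return (lambda x: sign if x[j] > thr else -sign), err

def adaboost(X, y, T=10):
    N = len(X)
    D = [1.0 / N] * N                      # initial uniform distribution
    hs, alphas = [], []
    for _ in range(T):
        h, err = weak_learn(X, y, D)
        err = max(err, 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        # Increase weights of misclassified samples, decrease the rest.
        D = [d * math.exp(-alpha * yi * h(xi)) for d, xi, yi in zip(D, X, y)]
        Z = sum(D)
        D = [d / Z for d in D]
        hs.append(h)
        alphas.append(alpha)
    # Strong classifier: sign of the weighted vote of the weak classifiers.
    return lambda x: 1 if sum(a * h(x) for a, h in zip(alphas, hs)) >= 0 else -1

# Hypothetical 1-D toy data: positive class for values above 0.5.
X = [[0.1], [0.3], [0.45], [0.6], [0.8], [0.9]]
y = [-1, -1, -1, 1, 1, 1]
H = adaboost(X, y, T=5)
print([H(x) for x in X])   # should reproduce the labels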
AdaBoost (Cont’d)
Advantages of AdaBoost
• Very simple to implement
• Performs feature selection on very large sets of features
• AdaBoost adjusts adaptively to the errors of the weak hypotheses returned by WeakLearn.
Machine learning with AdaBoost