Introduction to Machine Learning

Machine Learning Techniques
Introduction
Dr. Radhey Shyam
Professor
Department of Computer Science & Engineering
BIET Lucknow
Following slides have been prepared by Radhey Shyam, with grateful acknowledgement of others who made their course
contents freely available. Feel free to reuse these slides for your own academic purposes. Please send feedback at
shyam0058@gmail.com
August 31, 2020
Dr. Radhey Shyam (BIET Lucknow-AKTU) Machine Learning Techniques August 31, 2020 1 / 39

OutlineOutline
1 Introduction
2 Types of Learning
3 Designing a Learning System
4 History of Machine Learning
5 Function Representation Techniques
6 Search/Optimization Algorithms
7 Evaluation Parameters
8 Introduction of Machine Learning Approaches
9 Issues in Machine Learning
Data Science vs. Machine Learning
10 Conclusions
11 References

IntroductionIntroduction
What is Machine Learning?
“Learning is any process by which a system improves performance from
experience.”
- Herbert Simon
Machine Learning is the study of algorithms that
improve their performance P
at some task T
with experience E.
A well-deﬁned learning task is given by < P, T, E >.
-Tom Mitchell (1998)

IntroductionIntroduction (cont.)
Traditional Programming
Machine Learning

When Do We Use Machine Learning?
ML is used when:
Human expertise does not exist (navigating on Mars)
Humans can’t explain their expertise (speech recognition)
Models must be customized (personalized medicine)
Models are based on huge amounts of data (genomics)
Learning is not always useful:
There is no need to “learn” to calculate payroll

A classic example of a task that requires machine learning:
It is very hard to say what makes a 2

Some more examples of tasks that are best solved by using a
learning algorithm
Recognizing patterns:
Facial identities or facial expressions
Handwritten or spoken words
Medical images
Generating patterns:
Generating images or motion sequences
Recognizing anomalies:
Unusual credit card transactions
Unusual patterns of sensor readings in a nuclear power plant
Prediction:
Future stock prices or currency exchange rates

Some Sample Applications
Web search
Computational biology
Finance
E-commerce
Space exploration
Robotics
Information extraction
Social networks
Debugging software
Your intended area

Samuel’s Checkers-Player
“Machine Learning: Field of study that gives computers the ability to learn
without being explicitly programmed.”
-Arthur Samuel (1959)

Deﬁning the Learning Task
Improve on task T, with respect to performance metric P, based on
experience E
A checkers learning problem:
T: Playing checkers
P: Percentage of games won against an arbitrary opponent
E: Playing practice games against itself
A handwriting recognition learning problem:
T: Recognizing hand-written words
P: Percentage of words correctly classiﬁed
E: Database of human-labeled images of handwritten words

A robot driving learning problem:
T: Driving on four-lane highways using vision sensors
P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while
observing a human driver.
Email spam ﬁltering learning problem:
T: Categorize email messages as spam or legitimate.
P: Percentage of email messages correctly classiﬁed.
E: Database of emails, some with human-given labels

State of the Art Applications of Machine Learning
Autonomous Cars
!Nevada, constituent state of the United States of America.

Autonomous Car Sensors

Autonomous Car Technology

Deep Learning in the Headlines

Types of LearningTypes of Learning

Types of LearningTypes of Learning (cont.)
Supervised (inductive) learning: regression
and classiﬁcation
Given: training data + desired outputs
(labels)
Given (x1, y1), (x2, y2), ..., (xn, yn)
Learn a function f(x) to predict y given x
y is real-valued == regression
Given (x1, y1), (x2, y2), ..., (xn, yn)
Learn a function f(x) to predict y given x
y is categorical == classiﬁcation

Unsupervised learning: clustering
Given: training data (without desired outputs)
Given x1, x2, ..., xn (without labels)
Output hidden structure behind the x’s
E.g., clustering
Social network analysis
Astronomical data analysis
Organize computing clusters
Market segmentation
Genomics application: group individuals by
genetic similarity
Independent component analysis: separate a
combined signal into its original sources

Semi-supervised learning
Given: training data + a few
desired outputs
It combines a small amount of
labeled data with a large amount of
unlabeled data during training.
Reinforcement learning
Rewards from sequence of actions
Given a sequence of states and actions with
(delayed) rewards, output a policy
Policy is a mapping from states → actions
that tells you what to do in a given state
Examples:
Credit assignment problem
Game playing
Robot in a maze
Balance a pole on your hand

Deep Learning and Deep Reinforcement Larning
Thus, came the deep learning where the human brain is simulated in
the Artiﬁcial Neural Networks (ANN) created in our binary computers.
The machine now learns on its own using the high computing power
and huge memory resources that are available today.
It is now observed that Deep Learning has solved many of the
previously unsolvable problems.
The technique is now further advanced by giving incentives to Deep
Learning networks as awards and there ﬁnally comes Deep
Reinforcement Learning.

Designing a Learning SystemDesigning a Learning System
Choose the training experience
Choose exactly what is to be learned
i.e. the target function
Choose how to represent the target function
Choose a learning algorithm to infer the target function from the
experience

Designing a Learning SystemDesigning a Learning System (cont.)
Training vs. Test Distribution
We generally assume that the training and test examples are
independently drawn from the same overall distribution of data
We call this “i.i.d” which stands for “independent and identically
distributed”
If examples are not independent, requires collective classification
If test distribution is different, requires transfer learning which is a
research problem in machine learning that focuses on storing
knowledge gained while solving one problem and applying it to a
different but related problem. For example, knowledge gained
while learning to recognize cars could apply when trying to
recognize trucks.

History of Machine LearningHistory of Machine Learning
1950s
Samuel’s checker player
Selfridge’s Pandemonium
1960s:
Neural networks: Perceptron
Pattern recognition
Learning in the limit theory
Minsky and Papert prove limitations of Perceptron
1970s:
Symbolic concept induction
Winston’s arch learner
Expert systems and the knowledge acquisition bottleneck
ID3
Michalski’s AQ and soybean diagnosis
Scientiﬁc discovery with BACON
Mathematical discovery with AM

History of Machine LearningHistory of Machine Learning (cont.)
1980s:
Advanced decision tree and rule learning
Explanation-based Learning
Learning and planning and problem solving
Utility problem
Analogy
Cognitive architectures
Resurgence of neural networks (connectionism, backpropagation)
PAC Learning Theory
Focus on experimental methodology
1990s
Data mining
Adaptive software agents and web applications
Text learning
Reinforcement learning
Inductive Logic Programming
Ensembles: Bagging, Boosting, and Stacking

History of Machine LearningHistory of Machine Learning (cont.)
Bayes Net learning
2000s
Support vector machines & kernel methods
Graphical models
Statistical relational learning
Transfer learning
Learning in robotics and vision
2010s
Deep learning systems
Learning for big data
Bayesian methods
Multi-task & lifelong learning
Applications to vision, speech, social networks, learning to read,
etc.
???

Function Representations TechniquesFunction Representations Techniques
Numerical functions
Linear regression
Neural networks
Support vector machines
Symbolic functions
Decision trees
Rules in propositional logic
Rules in ﬁrst-order predicate logic
Instance-based functions
Nearest-neighbor
Case-based Reasoning
Probabilistic Graphical Models
Na¨ıve Bayes
Bayesian networks
Hidden-Markov Model
Probabilistic Context Free Grammars
Markov networks

Search/Optimization AlgorithmsSearch/Optimization Algorithms
Gradient descent
Perceptron
Backpropagation
Dynamic Programming
HMM Learning
PCFG Learning
Divide and Conquer
Decision tree induction
Rule learning
Evolutionary Computation
Genetic Algorithms (GAs)
Genetic Programming (GP)
Neuro-evolution

Evaluation ParametersEvaluation Parameters
Accuracy
Precision and recall
Squared error
Likelihood
Posterior probability
Cost/Utility
Margin
Entropy
K-L divergence
etc.

Introduction of Machine Learning ApproachesIntroduction of Machine Learning Approaches
Artificial Neural Network—Neural networks, which are simplified
models of the biological neuron system, is a massively parallel
distributed processing system made up highly interconnected neural
computing elements that has ability to learn and thereby acquire
knowledge and make it available for use.
Clustering—is a Machine Learning technique that involves the
grouping of data points. Given a set of data points, we can use a
clustering algorithm to classify each data point into a specific group.
In theory, data points that are in the same group should have similar
properties and/or features, while data points in different groups
should have highly dissimilar properties and/or features. Clustering is
a method of unsupervised learning and is a common technique for
statistical data analysis used in many fields.

Introduction of Machine Learning ApproachesIntroduction of Machine Learning Approaches (cont.)
Reinforcement Learning—is concerned with how agents take
actions in an environment and changing state so as to maximize some
notion of cumulative reward.
Decision Tree Learning—A decision tree is
drawn upside down with its root at the top.
Nodes in the decision trees represent features,
edges represent feature values of feature in-
tervals (decision options) and leaves represent
either discrete or continuous outcomes (clas-
siﬁcation and regression).
Bayesian Networks—A Bayesian Belief Net-
work(BBN) is a probabilistic graphical model that
represents a set of variables and their conditional de-
pendencies via a Directed Acyclic Graph (DAG). BBNs
enable us to model and reason about uncertainty.

Support Vector Machine—The objective
of the support vector machine algorithm is
to find a hyperplane in an N-dimensional
space(N — the number of features) that dis-
tinctly classifies the data points. To separate
the two classes of data points, there are many
possible hyperplanes that could be chosen.
Our objective is to find a plane that has the
maximum margin, i.e the maximum distance
between data points of both classes. Maxi-
mizing the margin distance provides some re-
inforcement so that future data points can be
classified with more confidence.

Genetic Algorithm—Genetic algorithm is a spe-
ciﬁc and early representative for a class of com-
putational model called Evolutionary computing.
Genetic algorithms are commonly used to gener-
ate global solutions to optimization/search prob-
lem.

Issues in Machine LearningIssues in Machine Learning
The checkers example raises a number of generic questions about machine
learning which are as follows:
What algorithms exist for learning general target functions from
specific training examples? In what settings will particular algorithms
converge to the desired function, given sufficient training data?
Which algorithms perform best for which types of problems and
representations?
How much training data is sufficient? What general bounds can be
found to relate the confidence in learned hypotheses to the amount of
training experience and the character of the learner’s hypothesis
space?
When and how can prior knowledge held by the learner guide the
process of generalizing from examples? Can prior knowledge be
helpful even when it is only approximately correct?

Issues in Machine LearningIssues in Machine Learning (cont.)
What is the best strategy for choosing a useful next training
experience, and how does the choice of this strategy alter the
complexity of the learning problem?
What is the best way to reduce the learning task to one or more
function approximation problems? Put another way, what specific
functions should the system attempt to learn? Can this process itself
be automated?
How can the learner automatically alter its representation to improve
its ability to represent and learn the target function?
Data Science vs. Machine Learning
Data Science— data science is a field of study that aims to use a
scientific approach to extract meaning and insights from data.
It uses various techniques from many fields like mathematics, machine
learning, computer programming, statistical modeling, data engineering
and visualization, pattern recognition and learning, uncertainty
modeling, data warehousing, and cloud computing.

Data Science does not necessarily involve big data, but the fact that
data is scaling up makes big data an important aspect of data science.
Data science is the most widely used technique among AI, ML and
itself. The practitioners of data science are usually skilled in
mathematics, statistics, and programming.
Data scientists solve complex data problems to bring out insights and
correlation relevant to a business.
Machine Learning—Machine learning creates a useful model or
program by autonomously testing many solutions against the available
data and finding the best fit for the problem.
This means machine learning is great at solving problems that are
extremely labor intensive for humans. It can inform decisions and make
predictions about complex topics in an efficient and reliable way.
These strengths make machine learning useful in a huge number of
different industries.
The possibilities for machine learning are vast.

This technology has the potential to save lives and solve
important problems in healthcare, computer security and many more.
Google, always on the cutting edge, has decided to integrate machine
learning into everything they do to stay ahead of the curve.

ConclusionsConclusions
Machine Learning is a technique of training machines to perform the
activities a human brain can do, albeit bit faster and better than an
average human-being. Today we have seen that the machines can
beat human champions in games such as Chess, AlphaGO, which are
considered very complex. You have seen that machines can be trained
to perform human activities in several areas and can aid humans in
living better lives.
Machine Learning can be a Supervised or Unsupervised. If you have
lesser amount of data and clearly labelled data for training, opt for
Supervised Learning. Unsupervised Learning would generally give
better performance and results for large data sets. If you have a huge
data set easily available, go for deep learning techniques. You also
have learned Reinforcement Learning and Deep Reinforcement
Learning. You now know what Neural Networks are, their applications
and limitations.

ReferencesReferences
[1] Thomas M. Mitchell.
Machine Learning.
McGraw-Hill, Inc., USA, 1 edition, 1997.
[2] Christopher M. Bishop.
Pattern Recognition and Machine Learning.
Springer, 2006.
[3] Stephen Marsland.
Machine Learning: An Algorithmic Perspective, Second Edition.
2nd edition, 2014.
[4] Ethem Alpaydin.
Introduction to Machine Learning.
Adaptive Computation and Machine Learning. MIT Press, Cambridge, MA, 3 edition, 2014.

Thank You!

Introduction to Machine Learning

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Introduction to Machine Learning

Similar to Introduction to Machine Learning (20)

More from Dr. Radhey Shyam

More from Dr. Radhey Shyam (20)

Recently uploaded

Recently uploaded (20)

Introduction to Machine Learning