Machine Learning
1
ML is a Technology …
ML Case studies :
a) Predicting house prices
b) Sentiment Analysis
c) Document Retrieval
d) Product Recommendation
e) Visual Product Recommendation
2
Classic and Adaptive machines
The image above roughly explains how
Machine Learning works.
3
4
• Let us say we have a dataset that contains pictures
of different kinds of fruits and we want Machine
Learning to segregate the photos based on the
kind of fruits.
• First we provide the dataset to the system i.e we
provide the input data.
• The system goes through the entire dataset or analyses
it to find patterns based on size, shapes, colors, etc.
• Now that it has figured out the patterns, the systems
takes decisions and starts separating the photos based
on the patterns.
• Once the work is done, the system learns from the
feedback it gets. If it gets any of the fruit type wrong, it
will make sure it does not happen in the future.
•
5
Here is how Machine Learning will work for us in this
case:
6
APPLICATIONS OF MACHINE LEARNING
• Google Search
• Stock Predictions
• Robotics-‘Sophia’ introduced which could
actually behave like humans.
• Social Media Services- Face Recognition , Add
as friend in facebook or people you may know
• Email Spam and Malware Filtering- C 4.5
Decision Tree Induction
• Over 325, 000 malwares are detected everyday and
each piece of code is 90–98% similar to its previous
versions. 7
Generic Representation of a Classical System that
receives some input values, processes them, and
produces output results:
•
8
• Machine learning algorithms are described as
learning a target function (f) that best maps
input variables (X) to an output variable (Y).
Y = f(X)
• This is a general learning task where we would like to make
predictions in the future (Y) given new examples of input
variables (X).
• It is harder than you think. There is also error (e) that is
independent of the input data (X).
Y = f(X) + e
• This error might be error such as not having enough attributes
to sufficiently characterize the best mapping from X to Y. This
error is called irreducible error because no matter how good
we get at estimating the target function (f), we cannot reduce
this error. 9
Schematic Representation of an adaptive system:
10
Adaptive Learning- Spam filtering, Natural
Language Processing, visual tracking with a
webcam or a smartphone, and predictive
analysis are only a few applications that
revolutionized human-machine interaction and
increased our expectations.
• Such a system isn't based on static or permanent
structures (model parameters and architectures)
but rather on a continuous ability to adapt its
behavior to external signals (datasets or real-time
inputs) and, like a human being, to predict the
future using uncertain and fragmentary pieces of
information.
11
Machine Learning Matters
• Machine learning is to study, engineer, and improve
mathematical models which can be trained (once or
continuously) with context-related data (provided by a
generic environment), to infer the future and to make
decisions without complete knowledge of all influencing
elements (external factors).
• In other words, an agent (which is a software entity that
receives information from an environment, picks the
best action to reach a specific goal, and observes the
results of it) adopts a statistical learning approach,
trying to determine the right probability distributions
and use them to compute the action (value or decision)
that is most likely to be successful (with the least error).
12
Machine learning is a sort of
modern magic.
• Prediction- Even in the most complex
scenarios, such as image classification with
convolutional neural networks, every piece of
information (geometry, color, peculiar
features, contrast, and so on) is already
present in the data and the model has to be
flexible enough to extract and learn it
permanently.
13
Supervised Learning
• Supervised learning is where you have input
variables (x) and an output variable (Y) and
you use an algorithm to learn the mapping
function from the input to the output.
Y = f(X)
• The goal is to approximate the mapping
function so well that when you have new
input data (x) that you can predict the output
variables (Y) for that data.
14
• It is called supervised learning because the
process of an algorithm learning from the
training dataset can be thought of as a teacher
supervising the learning process.
• We know the correct answers, the algorithm
iteratively makes predictions on the training
data and is corrected by the teacher.
• Learning stops when the algorithm achieves
an acceptable level of performance.
15
• Supervised learning problems can be further grouped into
regression and classification problems.
• Classification: A classification problem is when the output
variable is a category, such as “red” or “blue” or “disease” and
“no disease”.
• Regression: A regression problem is when the output variable is a
real value, such as “dollars” or “weight”.
• Some common types of problems built on top
of classification and regression include recommendation and time
series prediction respectively.
• Some popular examples of supervised machine learning
algorithms are:
• Linear regression for regression problems.
• Random forest for classification and regression problems.
• Support vector machines for classification problems.
16
Classification example
• Sometimes, instead of predicting the actual category, it's
better to determine its probability distribution.
• For example, an algorithm can be trained to recognize a
handwritten alphabetical letter, so its output is
categorical (in English, there'll be 26 allowed symbols).
• On the other hand, even for human beings, such a
process can lead to more than one probable outcome
when the visual representation of a letter isn't clear
enough to belong to a single category.
• That means that the actual output is better described by
a discrete probability distribution (for example, with 26
continuous values normalized so that they always sum up
to 1).
17
Problem with Supervised learning
• overfitting, which causes an overlearning due
to an excessive capacity.
• ability to predict correctly only the
samples used for training, while the
error for the remaining ones is
always very high
18
Common Supervised Learning
Applications include:
• Predictive analysis based on regression or
categorical classification Spam detection
• Pattern detection
• Natural Language Processing
• Sentiment analysis
• Automatic image classification
• Automatic sequence processing (for
example, music or speech)
19
20
Unsupervised Machine Learning
• Unsupervised learning is where you only have
input data (X) and no corresponding output
variables.
• The goal for unsupervised learning is to model
the underlying structure or distribution in the
data in order to learn more about the data.
• These are called unsupervised learning because
unlike supervised learning above there is no
correct answers and there is no teacher.
Algorithms are left to their own devises to
discover and present the interesting structure
in the data. 21
• Unsupervised learning problems can be further
grouped into clustering and association problems.
• Clustering: A clustering problem is where you want to
discover the inherent groupings in the data, such as
grouping customers by purchasing behavior.
• Association: An association rule learning problem is
where you want to discover rules that describe large
portions of your data, such as people that buy X also
tend to buy Y.
• Some popular examples of unsupervised learning
algorithms are:
• k-means for clustering problems.
• Apriori algorithm for association rule learning
problems. 22
Commons Unsupervised Applications
include:
• Object segmentation (for example, users,
products, movies, songs, and so on) Similarity
detection
• Automatic labeling
23
24
Semi-Supervised Machine Learning
• Problems where you have a large amount of input
data (X) and only some of the data is labeled (Y) are
called semi-supervised learning problems.
• These problems sit in between both supervised and
unsupervised learning.
• A good example is a photo archive where only some of
the images are labeled, (e.g. dog, cat, person) and the
majority are unlabeled.
• Many real world machine learning problems fall into
this area. This is because it can be expensive or time-
consuming to label data as it may require access to
domain experts. Whereas unlabeled data is cheap and
easy to collect and store. 25
Summary
• Supervised: All data is labeled and the
algorithms learn to predict the output
from the input data.
• Unsupervised: All data is unlabeled and
the algorithms learn to inherent structure
from the input data.
• Semi-supervised: Some data is labeled
but most of it is unlabeled and a mixture
of supervised and unsupervised
techniques can be used. 26
Reinforcement learning
• Reinforcement learning is also based on
feedback provided by the environment.
However, in this case, the information is more
qualitative and doesn't help the agent in
determining a precise measure of its error.
• this feedback is usually called reward
(sometimes, a negative one is defined as a
penalty) and it's useful to understand whether
a certain action performed in a state is
positive or not.
27
an action can also be imperfect, but in terms of a global
policy it has to offer the highest total reward.
• Reinforcement Learning is a framework for learning where
an agent interacts with an environment and receives a reward
for each interaction. The goal is to learn to accumulate as
much reward as possible over time.
• The real advantage these systems have over conventional supervised
learning is illustrated by this example I like a lot:
• Supervised Learning: Let us say that you know how to play chess. We
record you playing games against a lot of people. Now we train a system in
the supervised fashion to learn from your examples and call it KidPlayer.
Let us say that we train another system on Vishwanathan Anand’s games
and call this ProPlayer. Obviously the “policy” learned by KidPlayer will
be an inferior player to the policy learned by ProPlayer because of the
different capabilities of the teacher.
• Reinforcement Learning: In this setting, you make an agent play Chess
against someone (usually against another copy of itself) and give it a
reward for every time it wins a game.
28
• to learn the best policy for playing Atari video games and to
teach an agent how to associate the right action with an
input representing the state (usually a screenshot or a
memory dump).
• In the following figure, there's a schematic representation of a
deep neural network trained to play a famous Atari game.
• As input, there are one or more subsequent screenshots (this
can often be enough to capture the temporal dynamics as
well).
• They are processed using different layers (discussed briefly
later) to produce an output that represents the policy for a
specific state transition.
• After applying this policy, the game produces a feedback (as a
reward-penalty), and this result is used to refine the output
until it becomes stable (so the states are correctly recognized
and the suggested action is always the best one) and the total
reward overcomes a predefined threshold. 29
Atari Video Game
30
schematic representation of a deep neural
network trained to play a famous Atari game.
31
32
AI vs ML vs Deep Learning
33
Machine Learning vs Deep Learning
34
Deep Learning
• Machine learning concerned with algorithms inspired by the
structure and function of the brain called artificial neural
networks.
• A deep neural network (DNN) is an artificial neural Network
(ANN) with multiple hidden layers between the input and
output layers.
• DNNs are typically feedforward networks in which data flows
from the input layer to the output layer without looping back.
• where as feedforward with Backpropagation ( is a common
method for training a neural network in back direction with
change of weights according to error) forms a better DNN.
• Recurrent neural networks (RNNs), in which data can flow in
any direction, are used for applications for language purposes.
• Convolutional neural networks (CNNs) are used in computer
vision. CNNs have been applied to acoustic modeling for
automatic speech recognition
35
Beyond Machine Learning - Deep
Learning and bio-inspired adaptive
systems
• many researchers started training bigger
and bigger models, built with several
different layers (that's why this approach is
called deep learning), to solve new
challenging problems.
• The availability of cheap and fast computers
allowed them to get results in acceptable
timeframes and to use very large datasets
(made up of images, texts, and animations).
36
• The idea behind these techniques is to create
algorithms that work like a brain
• neurosciences and cognitive psychology.
• In particular, there's a growing interest in
pattern recognition and associative memories
whose structure and functioning are similar to
what happens in the neocortex. Such an
approach also allows simpler algorithms called
model- free
• It is based on generic learning techniques and
repeating experiences.
• testing different architectures and
optimization algorithms is quite simpler 37
Common Deep learning applications include:
• Image classification, Real-time visual tracking
• Autonomous car driving , Logistic optimization
• Bioinformatics, Speech recognition
38
39

Unit-V Machine Learning.ppt

  • 1.
  • 2.
    ML is aTechnology … ML Case studies : a) Predicting house prices b) Sentiment Analysis c) Document Retrieval d) Product Recommendation e) Visual Product Recommendation 2
  • 3.
    Classic and Adaptivemachines The image above roughly explains how Machine Learning works. 3
  • 4.
  • 5.
    • Let ussay we have a dataset that contains pictures of different kinds of fruits and we want Machine Learning to segregate the photos based on the kind of fruits. • First we provide the dataset to the system i.e we provide the input data. • The system goes through the entire dataset or analyses it to find patterns based on size, shapes, colors, etc. • Now that it has figured out the patterns, the systems takes decisions and starts separating the photos based on the patterns. • Once the work is done, the system learns from the feedback it gets. If it gets any of the fruit type wrong, it will make sure it does not happen in the future. • 5
  • 6.
    Here is howMachine Learning will work for us in this case: 6
  • 7.
    APPLICATIONS OF MACHINELEARNING • Google Search • Stock Predictions • Robotics-‘Sophia’ introduced which could actually behave like humans. • Social Media Services- Face Recognition , Add as friend in facebook or people you may know • Email Spam and Malware Filtering- C 4.5 Decision Tree Induction • Over 325, 000 malwares are detected everyday and each piece of code is 90–98% similar to its previous versions. 7
  • 8.
    Generic Representation ofa Classical System that receives some input values, processes them, and produces output results: • 8
  • 9.
    • Machine learningalgorithms are described as learning a target function (f) that best maps input variables (X) to an output variable (Y). Y = f(X) • This is a general learning task where we would like to make predictions in the future (Y) given new examples of input variables (X). • It is harder than you think. There is also error (e) that is independent of the input data (X). Y = f(X) + e • This error might be error such as not having enough attributes to sufficiently characterize the best mapping from X to Y. This error is called irreducible error because no matter how good we get at estimating the target function (f), we cannot reduce this error. 9
  • 10.
    Schematic Representation ofan adaptive system: 10
  • 11.
    Adaptive Learning- Spamfiltering, Natural Language Processing, visual tracking with a webcam or a smartphone, and predictive analysis are only a few applications that revolutionized human-machine interaction and increased our expectations. • Such a system isn't based on static or permanent structures (model parameters and architectures) but rather on a continuous ability to adapt its behavior to external signals (datasets or real-time inputs) and, like a human being, to predict the future using uncertain and fragmentary pieces of information. 11
  • 12.
    Machine Learning Matters •Machine learning is to study, engineer, and improve mathematical models which can be trained (once or continuously) with context-related data (provided by a generic environment), to infer the future and to make decisions without complete knowledge of all influencing elements (external factors). • In other words, an agent (which is a software entity that receives information from an environment, picks the best action to reach a specific goal, and observes the results of it) adopts a statistical learning approach, trying to determine the right probability distributions and use them to compute the action (value or decision) that is most likely to be successful (with the least error). 12
  • 13.
    Machine learning isa sort of modern magic. • Prediction- Even in the most complex scenarios, such as image classification with convolutional neural networks, every piece of information (geometry, color, peculiar features, contrast, and so on) is already present in the data and the model has to be flexible enough to extract and learn it permanently. 13
  • 14.
    Supervised Learning • Supervisedlearning is where you have input variables (x) and an output variable (Y) and you use an algorithm to learn the mapping function from the input to the output. Y = f(X) • The goal is to approximate the mapping function so well that when you have new input data (x) that you can predict the output variables (Y) for that data. 14
  • 15.
    • It iscalled supervised learning because the process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process. • We know the correct answers, the algorithm iteratively makes predictions on the training data and is corrected by the teacher. • Learning stops when the algorithm achieves an acceptable level of performance. 15
  • 16.
    • Supervised learningproblems can be further grouped into regression and classification problems. • Classification: A classification problem is when the output variable is a category, such as “red” or “blue” or “disease” and “no disease”. • Regression: A regression problem is when the output variable is a real value, such as “dollars” or “weight”. • Some common types of problems built on top of classification and regression include recommendation and time series prediction respectively. • Some popular examples of supervised machine learning algorithms are: • Linear regression for regression problems. • Random forest for classification and regression problems. • Support vector machines for classification problems. 16
  • 17.
    Classification example • Sometimes,instead of predicting the actual category, it's better to determine its probability distribution. • For example, an algorithm can be trained to recognize a handwritten alphabetical letter, so its output is categorical (in English, there'll be 26 allowed symbols). • On the other hand, even for human beings, such a process can lead to more than one probable outcome when the visual representation of a letter isn't clear enough to belong to a single category. • That means that the actual output is better described by a discrete probability distribution (for example, with 26 continuous values normalized so that they always sum up to 1). 17
  • 18.
    Problem with Supervisedlearning • overfitting, which causes an overlearning due to an excessive capacity. • ability to predict correctly only the samples used for training, while the error for the remaining ones is always very high 18
  • 19.
    Common Supervised Learning Applicationsinclude: • Predictive analysis based on regression or categorical classification Spam detection • Pattern detection • Natural Language Processing • Sentiment analysis • Automatic image classification • Automatic sequence processing (for example, music or speech) 19
  • 20.
  • 21.
    Unsupervised Machine Learning •Unsupervised learning is where you only have input data (X) and no corresponding output variables. • The goal for unsupervised learning is to model the underlying structure or distribution in the data in order to learn more about the data. • These are called unsupervised learning because unlike supervised learning above there is no correct answers and there is no teacher. Algorithms are left to their own devises to discover and present the interesting structure in the data. 21
  • 22.
    • Unsupervised learningproblems can be further grouped into clustering and association problems. • Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior. • Association: An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people that buy X also tend to buy Y. • Some popular examples of unsupervised learning algorithms are: • k-means for clustering problems. • Apriori algorithm for association rule learning problems. 22
  • 23.
    Commons Unsupervised Applications include: •Object segmentation (for example, users, products, movies, songs, and so on) Similarity detection • Automatic labeling 23
  • 24.
  • 25.
    Semi-Supervised Machine Learning •Problems where you have a large amount of input data (X) and only some of the data is labeled (Y) are called semi-supervised learning problems. • These problems sit in between both supervised and unsupervised learning. • A good example is a photo archive where only some of the images are labeled, (e.g. dog, cat, person) and the majority are unlabeled. • Many real world machine learning problems fall into this area. This is because it can be expensive or time- consuming to label data as it may require access to domain experts. Whereas unlabeled data is cheap and easy to collect and store. 25
  • 26.
    Summary • Supervised: Alldata is labeled and the algorithms learn to predict the output from the input data. • Unsupervised: All data is unlabeled and the algorithms learn to inherent structure from the input data. • Semi-supervised: Some data is labeled but most of it is unlabeled and a mixture of supervised and unsupervised techniques can be used. 26
  • 27.
    Reinforcement learning • Reinforcementlearning is also based on feedback provided by the environment. However, in this case, the information is more qualitative and doesn't help the agent in determining a precise measure of its error. • this feedback is usually called reward (sometimes, a negative one is defined as a penalty) and it's useful to understand whether a certain action performed in a state is positive or not. 27
  • 28.
    an action canalso be imperfect, but in terms of a global policy it has to offer the highest total reward. • Reinforcement Learning is a framework for learning where an agent interacts with an environment and receives a reward for each interaction. The goal is to learn to accumulate as much reward as possible over time. • The real advantage these systems have over conventional supervised learning is illustrated by this example I like a lot: • Supervised Learning: Let us say that you know how to play chess. We record you playing games against a lot of people. Now we train a system in the supervised fashion to learn from your examples and call it KidPlayer. Let us say that we train another system on Vishwanathan Anand’s games and call this ProPlayer. Obviously the “policy” learned by KidPlayer will be an inferior player to the policy learned by ProPlayer because of the different capabilities of the teacher. • Reinforcement Learning: In this setting, you make an agent play Chess against someone (usually against another copy of itself) and give it a reward for every time it wins a game. 28
  • 29.
    • to learnthe best policy for playing Atari video games and to teach an agent how to associate the right action with an input representing the state (usually a screenshot or a memory dump). • In the following figure, there's a schematic representation of a deep neural network trained to play a famous Atari game. • As input, there are one or more subsequent screenshots (this can often be enough to capture the temporal dynamics as well). • They are processed using different layers (discussed briefly later) to produce an output that represents the policy for a specific state transition. • After applying this policy, the game produces a feedback (as a reward-penalty), and this result is used to refine the output until it becomes stable (so the states are correctly recognized and the suggested action is always the best one) and the total reward overcomes a predefined threshold. 29
  • 30.
  • 31.
    schematic representation ofa deep neural network trained to play a famous Atari game. 31
  • 32.
  • 33.
    AI vs MLvs Deep Learning 33
  • 34.
    Machine Learning vsDeep Learning 34
  • 35.
    Deep Learning • Machinelearning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks. • A deep neural network (DNN) is an artificial neural Network (ANN) with multiple hidden layers between the input and output layers. • DNNs are typically feedforward networks in which data flows from the input layer to the output layer without looping back. • where as feedforward with Backpropagation ( is a common method for training a neural network in back direction with change of weights according to error) forms a better DNN. • Recurrent neural networks (RNNs), in which data can flow in any direction, are used for applications for language purposes. • Convolutional neural networks (CNNs) are used in computer vision. CNNs have been applied to acoustic modeling for automatic speech recognition 35
  • 36.
    Beyond Machine Learning- Deep Learning and bio-inspired adaptive systems • many researchers started training bigger and bigger models, built with several different layers (that's why this approach is called deep learning), to solve new challenging problems. • The availability of cheap and fast computers allowed them to get results in acceptable timeframes and to use very large datasets (made up of images, texts, and animations). 36
  • 37.
    • The ideabehind these techniques is to create algorithms that work like a brain • neurosciences and cognitive psychology. • In particular, there's a growing interest in pattern recognition and associative memories whose structure and functioning are similar to what happens in the neocortex. Such an approach also allows simpler algorithms called model- free • It is based on generic learning techniques and repeating experiences. • testing different architectures and optimization algorithms is quite simpler 37
  • 38.
    Common Deep learningapplications include: • Image classification, Real-time visual tracking • Autonomous car driving , Logistic optimization • Bioinformatics, Speech recognition 38
  • 39.