Machine Learning Methods
FOSS GURU
Objectives
 Let us look at some of the objectives under this
Techniques of Machine Learning tutorial.
 Explain unsupervised learning with examples
 Describe semi-supervised learning and reinforcement
learning
 Discuss supervised learning with examples
 Define some important models and techniques in
Machine Learning
How do Machines learn?
 There are various methods for this, and which one to follow depends entirely on the problem statement. Depending on the dataset and the problem, there are two main ways to go deeper: one is supervised learning and the other is unsupervised learning. The following chart shows the further classification of machine learning methods. We will discuss them one by one.
Machine Learning Methods
 Supervised Learning: Classification, Regression
 Unsupervised Learning: Clustering, Association
 Other Machine Learning Methods: Dimensionality Reduction, Ensemble Methods, Neural Nets and Deep Learning, Transfer Learning, Natural Language Processing, Word Embeddings
What is Supervised Learning?
 Supervised Learning is a type of Machine Learning used
to learn models from labeled training data. It allows us to
predict the output for future or unseen data.
Understanding the Algorithm of
Supervised Learning
The image above explains the relationship
between input and output data of
Supervised Learning.
Supervised Learning Flow
 Let’s look at the steps of Supervised Learning flow:
 Data Preparation
 Training Step
 Evaluation or Test Step
 Production Deployment
Testing the Algorithm
 Given below are the steps for testing the algorithm of Supervised
Learning.
 Once the algorithm is trained, test it with test data (a set of data
instances that do not appear in the training set).
 A well-trained algorithm can predict well for new test data.
 If the learning is poor, we have an underfit situation: the algorithm will not work well even on training data, and retraining may be needed to find a better fit.
 If learning on the training data is too intensive, it may lead to overfitting: a situation where the algorithm cannot handle new test data it has not seen before. The family of techniques that keeps the model generic is called regularization. A minimal sketch of this diagnostic follows below.
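As a rough illustration of this flow, here is a minimal sketch that holds out test data, trains a model, and compares train and test scores to diagnose underfitting or overfitting. scikit-learn, the synthetic dataset, and ridge (L2) regularization are all assumptions for this sketch, not part of the original deck.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge  # L2 regularization keeps the model generic
from sklearn.model_selection import train_test_split

# Synthetic data standing in for a real labeled dataset.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = Ridge(alpha=1.0)  # alpha controls regularization strength
model.fit(X_train, y_train)

print("train R^2:", model.score(X_train, y_train))
print("test  R^2:", model.score(X_test, y_test))
# Low scores on both sets suggest underfitting; a large train/test gap suggests overfitting.
```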
Examples of Supervised Learning
 Example 1: Voice assistants like Apple Siri, Amazon Alexa, Microsoft Cortana, and Google Assistant are trained to understand human speech and intent. Based on human interactions, these assistants take the appropriate action.
 Example 2: Gmail filters a new email into Inbox (normal)
or Junk folder (Spam) based on past information about
what you consider spam.
 Example 3: The predictions made by weather apps at a
given time are based on some prior knowledge and
analysis of how the weather has been over a period of
time for a particular place.
Types of Supervised Learning
Given below are 2 types of Supervised Learning.
 Classification
 Regression
Classification
 We use classification when the output variable is categorical, i.e. it takes one of two or more classes. The answer is a category such as true/false, yes/no, black/white, male/female, or fit/unfit.
 Classification is the problem of predicting which class a data point belongs to, which is usually a discrete value. For example, predicting whether a person is likely to default on a loan is a classification problem, since the classes we want to predict are discrete: "likely to repay the loan" and "not likely to repay the loan".
Classification: predicting a
class/label
 Classification is used to predict a discrete class or label (Y). It involves assigning new input variables (X) to the class they most likely belong to, based on a classification model built from training data that was already labeled. Labeled data is used to train a classifier so that the algorithm performs well on data that does not yet have a label. This process of training a classifier on already-labeled data is known as "learning".
 Some of the questions that a classification model helps to
answer are:
 Is this a picture of a cat or a dog?
 Is this email Spam or not?
 Is it going to rain or not?
 Is this borrower going to repay their loan?
 Is this post negative or positive?
 What is the genre of this song/movie?
 Which type of gene is this?
 Classification is further divided into three categories of problems: binary classification, multi-class/multinomial classification, and multi-label classification.
Binary classification
 This is the task of classifying the elements/input variables of a given set into two groups, i.e. predicting which of the two groups each element belongs to. Problems like predicting whether a picture is of a cat or a dog, or whether an email is Spam or not, are binary classification problems. A minimal sketch follows below.
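A hedged sketch of a binary classifier, assuming scikit-learn and its built-in two-class breast-cancer dataset (neither is named in the deck):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)  # each label is one of two classes
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000)  # a simple binary classifier
clf.fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))
```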
Multi-class/Multinomial
classification
 This is the task of classifying elements/input variables into one of three or more classes/groups, in contrast to binary classification, where elements are classified into one of two classes. Some use cases of this type of classification are: classifying news into different categories (sports/entertainment/political), sentiment analysis (classifying text as positive, negative, or neutral), segmenting customers for marketing purposes, etc.
 Note that sentiment analysis can be either a binary classification or a multi-class classification, depending on the number of classes used to classify the text. In binary, one would predict whether a statement is "negative" or "positive", while in multi-class, one would have additional classes to predict, such as sadness, happiness, fear/surprise, and anger/disgust.
Multi-label classification
 This problem is easily confused with multi-class classification, but there is a distinct difference. Multi-class classification is a single-label problem: each instance is assigned exactly one of more than two classes. Multi-label classification generalizes this so that each instance can carry more than one discrete class at once (a movie, for example, can be tagged both comedy and romance).
Classification Algorithms
There are various classification algorithms used to make predictions, such as:
 Neural Networks — These have many use cases; one example is Computer Vision, typically done with convolutional neural networks (CNNs), which is how systems such as Google's classify people and places in images.
 K-NN — K-Nearest Neighbors is often used in search applications where you are looking for "similar" items. One of the biggest use cases of K-NN search is in the development of Recommender Systems.
 Decision Trees — Decision trees are used in both regression and classification problems. A decision tree can visually and explicitly represent decisions and decision making. They can be used, for example, to assess the characteristics of a client that lead to the purchase of a new product in a direct marketing campaign.
 Random Forests — Random Forest algorithms can also be used in both regression and classification problems. A random forest builds multiple decision trees and merges their predictions to get a more accurate and stable result. It can be used in a number of circumstances, including image classification, recommendation engines, feature selection, etc.
 Support Vector Machines (SVM) — A fundamental algorithm that can be used for both regression and classification problems, though it is mostly used for classification. It has a plethora of use cases, such as face detection, handwriting recognition, and image classification, to mention a few.
 Naive Bayes — A simple, easy-to-implement algorithm. A classical use case for Naive Bayes is document classification: determining whether a given text document corresponds to one or more categories, such as classifying an email as Spam or not Spam, or a news article as technology, politics, or sports. It is also commonly used for sentiment analysis. A minimal sketch follows below.
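As a minimal sketch of the Naive Bayes use case above (scikit-learn and the toy messages are assumptions for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = ["win a free prize now", "meeting at noon tomorrow",
        "free lottery winner claim prize", "lunch with the team"]
labels = ["spam", "ham", "spam", "ham"]  # toy labels for illustration

# Turn text into word counts, then fit a Naive Bayes classifier.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(docs, labels)
print(clf.predict(["claim your free prize"]))  # likely ['spam'] on this toy data
```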
Regression
 Regression models the relationship between two or more associated variables, where a change in one variable drives a change in another. For example, the salary you can ask for depends on your working experience; a height-weight chart by age is another example of a regression relationship.
 Regression is the problem of predicting a continuous quantity as output. A continuous output variable is a real value, such as a floating-point number. For example, where classification would be used to determine whether or not it will rain tomorrow, a regression algorithm would be used to predict the amount of rainfall.
Types of Regression
 Simple Linear Regression
 Polynomial Regression
 Support Vector Regression
 Decision Tree Regression
 Random Forest Regression
Simple Linear Regression
 This is one of the most common and interesting types of regression technique. Here we predict a target variable Y based on the input variable X. A linear relationship should exist between the target variable and the predictor, hence the name Linear Regression.
 Consider predicting the salary of an employee based on his/her age. We can easily see that there appears to be a correlation between an employee's age and salary (the greater the age, the higher the salary). The hypothesis of linear regression is
 Y = a + bX
 Y represents salary, X is the employee's age, and a and b are the coefficients of the equation. So in order to predict Y (salary) given X (age), we need to know the values of a and b (the model's coefficients). A minimal fitting sketch follows below.
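A minimal fitting sketch, assuming scikit-learn and made-up age/salary numbers:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

age = np.array([[22], [25], [30], [35], [40], [45]])  # X: employee age (hypothetical)
salary = np.array([30, 34, 42, 50, 58, 66])           # Y: salary in thousands (hypothetical)

model = LinearRegression().fit(age, salary)
a, b = model.intercept_, model.coef_[0]               # the coefficients of Y = a + bX
print(f"Y = {a:.2f} + {b:.2f} * X")
print("predicted salary at age 28:", model.predict([[28]])[0])
```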
Polynomial Regression
 In polynomial regression, we transform the original features into polynomial features of a given degree and then apply Linear Regression to them. The linear model Y = a + bX above, for example, is transformed into something like Y = a + b1X + b2X^2 for degree 2. A sketch follows below.
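A hedged sketch of that transformation, assuming scikit-learn's PolynomialFeatures:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

X = np.linspace(0, 5, 30).reshape(-1, 1)
y = 1 + 2 * X.ravel() + 0.5 * X.ravel() ** 2  # quadratic ground truth

# Expand X into [1, X, X^2], then fit ordinary linear regression on the new features.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print(model.predict([[2.0]]))  # close to 1 + 2*2 + 0.5*4 = 7
```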
Support Vector Regression
 In SVR, we identify a hyperplane with maximum margin such that the maximum number of data points lies within that margin. SVR works much like the SVM classification algorithm, except that it predicts a continuous value rather than a class. A minimal sketch follows below.
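A minimal SVR sketch, assuming scikit-learn and synthetic data; epsilon sets the width of the margin (the "tube") within which errors are tolerated:

```python
from sklearn.datasets import make_regression
from sklearn.svm import SVR

X, y = make_regression(n_samples=100, n_features=3, noise=5.0, random_state=0)

model = SVR(kernel="rbf", C=10.0, epsilon=0.1)  # epsilon: half-width of the margin
model.fit(X, y)
print(model.predict(X[:3]))  # continuous predictions, not class labels
```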
Decision Tree Regression
 Decision trees can be used for classification as well as regression. In decision trees, at each level we need to identify the splitting attribute. In the case of regression, an ID3-style algorithm can choose the splitting node by reducing the standard deviation of the target (in classification, information gain is used instead). A minimal sketch follows below.
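A minimal regression-tree sketch, assuming scikit-learn (whose regression trees choose splits by reducing squared error, i.e. variance, in the spirit of the standard-deviation reduction described above):

```python
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=100, n_features=4, noise=3.0, random_state=0)

tree = DecisionTreeRegressor(max_depth=3, random_state=0)  # each level picks a splitting attribute
tree.fit(X, y)
print(tree.predict(X[:3]))
```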
Random Forest Regression
 Random forest is an ensemble approach in which we combine the predictions of several decision regression trees:
 Pick K random data points from the training set.
 Build a decision tree regressor on those points; repeat steps 1 and 2 n times, where n is the number of decision tree regressors to be created.
 In each decision tree, the average of the training targets falling in a leaf becomes that leaf's prediction.
 To predict the output for a new data point, the average of the predictions of all the decision trees is taken.
 Random Forest prevents the overfitting that is common in single decision trees by creating random subsets of the features and building smaller trees from these subsets. A minimal sketch follows below.
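A minimal sketch, assuming scikit-learn and synthetic data:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=200, n_features=6, noise=5.0, random_state=0)

# n_estimators = number of decision-tree regressors; each tree sees a random
# sample of the data, and the forest averages their predictions.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, y)
print(forest.predict(X[:3]))
```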
Classification Supervised Learning
Let us look at the classification type of Supervised Learning.
 Answers “What class?”
 Applied when the output has finite and
discrete values
Example: Social media sentiment analysis has three potential outcomes,
positive, negative, or neutral.
Example: Given the age and salary of consumers, predict whether they
will be interested in purchasing a house. You can perform this in your
lab environment with the dataset available in the LMS.
Regression Supervised Learning
Given below are some elements of Regression Supervised learning.
 Answers “How much?”
 Applied when the output is a continuous number
 A simple regression algorithm: y = wx + b. Example:
the relationship between environmental
temperature (y) and humidity levels (x)
Example
Given the details of the area a house is located, predict the prices. You can
perform this in your lab environment with the dataset available in the LMS.
Unsupervised Learning: Case
Study
 Ever wondered how NASA discovers a new heavenly body
and identifies that it is different from a previously known
astronomical object? It has no knowledge of these new
bodies but classifies them into proper categories.
 NASA uses unsupervised learning to create clusters of
heavenly bodies, with each cluster containing objects of a
similar nature. Unsupervised Learning is a subset of
Machine Learning used to extract inferences from
datasets that consist of input data without labeled
responses.
Types of Unsupervised Learning
The 3 types of Unsupervised Learning are:
 Clustering
 Visualization Algorithms
 Anomaly Detection
The most common unsupervised learning method is cluster
analysis. It is used to find data clusters so that each cluster
has the most closely matched data.
Clustering
Example: An online news portal segments articles into various categories like
Business, Technology, Sports, etc.
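As a hedged illustration of clustering (scikit-learn and the synthetic blobs are assumptions), a minimal K-Means run that groups unlabeled points:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)  # unlabeled points

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)  # a cluster index for each point, learned without labels
print(labels[:10])
```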
Visualization Algorithms
 Visualization algorithms are unsupervised learning algorithms that accept
unlabeled data and display this data in an intuitive 2D or 3D format. The
data is separated into somewhat clear clusters to aid understanding.
 In the figure, the animals are rather well separated from the vehicles; horses are close to deer but far from birds, and so on. A minimal sketch of such a projection follows below.
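A minimal sketch of such a projection, assuming scikit-learn's t-SNE and its digits dataset:

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)  # 64-dimensional images

# Project to 2D so that similar points end up near each other.
X_2d = TSNE(n_components=2, random_state=0).fit_transform(X)
print(X_2d.shape)  # (1797, 2): ready to scatter-plot
```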
Anomaly Detection
 This algorithm detects anomalies in data without any
prior training. It can detect suspicious credit card
transactions and differentiate a criminal from a set of
people.
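A minimal sketch of the idea, assuming scikit-learn's IsolationForest and made-up "transaction" data:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X_normal = rng.normal(loc=100, scale=5, size=(500, 2))  # typical transactions
X_suspect = rng.uniform(low=0, high=300, size=(5, 2))   # a few odd ones

detector = IsolationForest(contamination=0.01, random_state=0).fit(X_normal)
print(detector.predict(X_suspect))  # -1 marks anomalies, +1 normal (mostly -1 here)
```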
What is Semi-Supervised
Learning?
 It is a hybrid approach (combination of Supervised and
Unsupervised Learning) with some labeled and some
non-labeled data.
Example of Semi-Supervised Learning
 Google Photos automatically detects the same person in
multiple photos from a vacation trip (clustering –
unsupervised). One has to just name the person once
(supervised), and the name tag gets attached to that
person in all the photos.
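A hedged sketch of the hybrid idea, assuming scikit-learn's LabelPropagation and the iris dataset with most labels hidden (marked -1):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelPropagation

X, y = load_iris(return_X_y=True)
rng = np.random.RandomState(0)
y_partial = y.copy()
y_partial[rng.rand(len(y)) < 0.9] = -1  # hide ~90% of the labels (-1 = unlabeled)

model = LabelPropagation().fit(X, y_partial)  # labels spread to the unlabeled points
print("accuracy against the true labels:", model.score(X, y))
```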
What is Reinforcement Learning?
 Reinforcement Learning is a type of Machine Learning
that allows the learning system to observe the
environment and learn the ideal behavior based on trying
to maximize some notion of cumulative reward.
Features of Reinforcement Learning
Some of the features of Reinforcement Learning are
mentioned below.
 The learning system (agent) observes the environment,
selects and takes certain actions, and gets rewards in
return (or penalties in certain cases).
 The agent learns the strategy or policy (choice of actions)
that maximizes its rewards over time.
Reinforcement Learning
Example of Reinforcement Learning
 In a manufacturing unit, a robot uses deep reinforcement
learning to identify a device from one box and put it in a
container. The robot learns this by means of a rewards-
based learning system, which incentivizes it for the right
action.
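As a minimal illustration of the agent/reward loop, a tabular Q-learning sketch in plain Python. The 5-state corridor environment and all constants are made up for this sketch, not taken from the deck:

```python
import random

n_states, n_actions = 5, 2  # states 0..4; actions: 0 = left, 1 = right; goal = state 4
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.5, 0.9, 0.3  # learning rate, discount, exploration rate

for episode in range(500):
    s = 0
    while s != 4:
        # Explore occasionally, otherwise act greedily on the current Q-values.
        if random.random() < epsilon:
            a = random.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda x: Q[s][x])
        s_next = max(0, s - 1) if a == 0 else min(4, s + 1)
        r = 1.0 if s_next == 4 else 0.0  # reward only at the goal
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

# The learned policy should be "move right" (action 1) in every state.
print([max(range(n_actions), key=lambda x: Q[s][x]) for s in range(4)])
```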
Other Machine Learning
 Dimensionality Reduction
 Ensemble Methods
 Neural Nets and Deep Learning
 Transfer Learning
 Natural Language Processing
 Word Embeddings
Dimensionality reduction
 Dimensionality reduction can be thought of as compressing a file: it removes the information that is not relevant. It reduces the complexity of the data while trying to keep the meaningful parts. For example, in image compression, we reduce the dimensionality of the space the image lives in without destroying too much of the meaningful content of the image. A sketch using one standard technique follows below.
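As a sketch, PCA is one standard dimensionality-reduction technique (the slide does not name one); scikit-learn assumed:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)  # 64 features per image

pca = PCA(n_components=0.95)         # keep enough components for 95% of the variance
X_reduced = pca.fit_transform(X)
print(X.shape, "->", X_reduced.shape)  # far fewer dimensions, most information kept
```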
Ensemble Methods
 Imagine you've decided to build a bicycle because you are not happy with the options available in stores and online. You might begin by finding the best version of each part you need. Once you assemble all these great parts, the resulting bike will outshine all the other options.
 Ensemble methods use this same idea of combining several predictive models (supervised ML) to get higher-quality predictions than any of the models could provide on its own. For example, the Random Forest algorithm is an ensemble method that combines many Decision Trees trained on different samples of the data. As a result, the quality of the predictions of a Random Forest is higher than the quality of the predictions estimated with a single Decision Tree.
 Think of ensemble methods as a way to reduce the
variance and bias of a single machine learning model.
That’s important because any given model may be
accurate under certain conditions but inaccurate under
other conditions. With another model, the relative
accuracy might be reversed. By combining the two
models, the quality of the predictions is balanced out.
 The great majority of top winners of Kaggle competitions
use ensemble methods of some kind. The most popular
ensemble algorithms are Random
Forest, XGBoost and LightGBM.
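A minimal sketch of the single-model-versus-ensemble point, assuming scikit-learn:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)  # many trees, averaged

print("single tree:", cross_val_score(tree, X, y, cv=5).mean())
print("forest:     ", cross_val_score(forest, X, y, cv=5).mean())
# The ensemble typically scores higher and varies less across folds.
```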
Neural Nets and Deep Learning
 In contrast to linear and logistic regression, which are considered linear models, the objective of neural networks is to capture non-linear patterns in data by adding layers of parameters to the model. In the figure on this slide, the simple neural net has three inputs, a single hidden layer with five units, and an output layer.
 In fact, the structure of neural networks is flexible enough to reproduce our well-known linear and logistic regressions. The term deep learning comes from a neural net with many hidden layers (see the next figure) and encapsulates a wide variety of architectures.
 It’s especially difficult to keep up with developments in
deep learning, in part because the research and industry
communities have doubled down on their deep learning
efforts, spawning whole new methodologies every day.
 For the best performance, deep learning techniques
require a lot of data — and a lot of compute power since
the method is self-tuning many parameters within huge
architectures. It quickly becomes clear why deep learning
practitioners need very powerful computers enhanced
with GPUs (graphical processing units).
 In particular, deep learning techniques have been extremely successful in the areas of vision (image classification), text, audio, and video. The most common software packages for deep learning are TensorFlow and PyTorch. A tiny sketch follows below.
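A hedged sketch of the small net described above (three inputs, a five-unit hidden layer, one output), using PyTorch:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(3, 5),  # three inputs -> five hidden units
    nn.ReLU(),        # the non-linearity is what separates this from a linear model
    nn.Linear(5, 1),  # hidden units -> one output
)

x = torch.randn(8, 3)  # a batch of 8 examples with 3 features each
y_hat = model(x)
print(y_hat.shape)     # torch.Size([8, 1])
```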
Transfer Learning
 Let's pretend that you're a data scientist working in the retail industry. You've spent months training a high-quality model to classify images as shirts, t-shirts, and polos. Your new task is to build a similar model to classify images of pants as jeans, cargo, casual, and dress pants. Can you transfer the knowledge built into the first model and apply it to the second model? Yes, you can, using Transfer Learning.
 Transfer Learning refers to re-using part of a previously
trained neural net and adapting it to a new but similar
task. Specifically, once you train a neural net using data
for a task, you can transfer a fraction of the trained layers
and combine them with a few new layers that you can
train using the data of the new task. By adding a few
layers, the new neural net can learn and adapt quickly to
the new task.
 The main advantage of transfer learning is that you need
less data to train the neural net, which is particularly
important because training for deep learning algorithms
is expensive in terms of both time and money
(computational resources) — and of course it’s often very
difficult to find enough labeled data for the training.
 Let’s return to our example and assume that for the shirt
model you use a neural net with 20 hidden layers. After
running a few experiments, you realize that you can
transfer 18 of the shirt model layers and combine them
with one new layer of parameters to train on the images
of pants. The pants model would therefore have 19
hidden layers. The inputs and outputs of the two tasks are
different but the re-usable layers may be summarizing
information that is relevant to both, for example aspects
of cloth.
 Transfer learning has become more and more popular
and there are now many solid pre-trained models
available for common deep learning tasks like image and
text classification.
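A hedged sketch of the freeze-and-replace recipe, assuming PyTorch/torchvision (0.13+ for the weights argument) and a hypothetical 4-class pants task:

```python
import torch.nn as nn
from torchvision import models

# Re-use a network pre-trained on ImageNet and freeze its transferred layers.
model = models.resnet18(weights="IMAGENET1K_V1")
for param in model.parameters():
    param.requires_grad = False  # the transferred layers stay fixed

# Replace only the final layer; this is the one new layer trained on pants images.
model.fc = nn.Linear(model.fc.in_features, 4)  # 4 classes: jeans, cargo, casual, dress
```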
Natural Language Processing
 A huge percentage of the world’s data and knowledge is
in some form of human language. Can you imagine being
able to read and comprehend thousands of books,
articles and blogs in seconds? Obviously, computers can’t
yet fully understand human text but we can train them to
do certain tasks. For example, we can train our phones to
autocomplete our text messages or to correct misspelled
words. We can even teach a machine to have a simple
conversation with a human.
 Natural Language Processing (NLP) is not a machine learning method per se, but rather a widely used technique to prepare text for machine learning. Think of tons of text documents in a variety of formats (Word documents, online blogs, and so on). Most of these text documents will be full of typos, missing characters, and other words that need to be filtered out. At the moment, one of the most popular packages for processing text is NLTK (the Natural Language ToolKit), originally created by researchers at the University of Pennsylvania.
 The simplest way to map text into a numerical representation is to compute the frequency of each word within each text document. Think of a matrix of integers where each row represents a text document and each column represents a word. This matrix representation of the word frequencies is commonly called the Term Frequency Matrix (TFM). From there, we can create another popular matrix representation of a text document by re-weighting each entry in the matrix according to how important the word is within the entire corpus of documents, down-weighting words that appear everywhere. We call this method Term Frequency Inverse Document Frequency (TFIDF), and it typically works better for machine learning tasks. A minimal sketch of both follows below.
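A minimal sketch of both representations, assuming scikit-learn (CountVectorizer yields the raw term-frequency matrix, TfidfVectorizer applies the inverse-document-frequency weighting):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat on the mat", "the dog ate my homework"]

tfm = CountVectorizer().fit_transform(docs)    # TFM: rows = documents, columns = word counts
tfidf = TfidfVectorizer().fit_transform(docs)  # counts down-weighted by document frequency

print(tfm.toarray())
print(tfidf.toarray().round(2))
```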
Word Embeddings
TFM and TFIDF are numerical representations of text
documents that only consider frequency and weighted
frequencies to represent text documents. By contrast, word
embeddings can capture the context of a word in a
document. With the word context, embeddings can quantify
the similarity between words, which in turn allows us to do
arithmetic with words.
Word2Vec is a method based on neural nets that maps the words in a corpus to numerical vectors. We can then use these vectors to find synonyms, perform arithmetic operations with words, or represent text documents (by taking the mean of all the word vectors in a document). For example, let's assume that we use a sufficiently big corpus of text documents to estimate word embeddings, and that the words king, queen, man, and woman are part of the corpus. Let vector('word') be the numerical vector that represents the word 'word'. To estimate vector('queen'), we can perform arithmetic with vectors:
vector('king') + vector('woman') - vector('man') ≈ vector('queen')
Arithmetic with word vectors (embeddings).
Word representations also allow us to find similarities between words by computing the cosine similarity between the vector representations of two words, where cosine similarity measures the angle between the two vectors.
We compute word embeddings using machine learning methods, but
that’s often a pre-step to applying a machine learning algorithm on top.
For instance, suppose we have access to the tweets of several thousand
Twitter users. Also suppose that we know which of these Twitter users
bought a house. To predict the probability of a new Twitter user buying a
house, we can combine Word2Vec with a logistic regression.
You can train word embeddings yourself or get a pre-trained (transfer
learning) set of word vectors. To download pre-trained word vectors in
157 different languages, take a look at FastText.
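A hedged sketch using gensim (an assumption; the deck names no library). On a toy corpus the result is illustrative only; real embeddings need a large corpus or a pre-trained set such as FastText:

```python
from gensim.models import Word2Vec

# A tiny made-up corpus, repeated so the model has something to fit.
sentences = [["king", "man", "royal"], ["queen", "woman", "royal"],
             ["man", "strong"], ["woman", "graceful"]] * 100

model = Word2Vec(sentences, vector_size=50, min_count=1, seed=0)

# king + woman - man ~ queen, ranked by cosine similarity over the learned vectors.
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```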
Get The Text

More Related Content

What's hot

M08 BiasVarianceTradeoff
M08 BiasVarianceTradeoffM08 BiasVarianceTradeoff
M08 BiasVarianceTradeoff
Raman Kannan
 
Major presentation
Major presentationMajor presentation
Major presentation
PS241092
 
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Simplilearn
 
Module 5: Decision Trees
Module 5: Decision TreesModule 5: Decision Trees
Module 5: Decision Trees
Sara Hooker
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Module 2: Machine Learning Deep Dive
Module 2:  Machine Learning Deep DiveModule 2:  Machine Learning Deep Dive
Module 2: Machine Learning Deep Dive
Sara Hooker
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
error007
 
Machine learning - session 3
Machine learning - session 3Machine learning - session 3
Machine learning - session 3
Luis Borbon
 
Module 3: Linear Regression
Module 3:  Linear RegressionModule 3:  Linear Regression
Module 3: Linear Regression
Sara Hooker
 
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic ConceptsData Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Salah Amean
 
Q01741118123
Q01741118123Q01741118123
Q01741118123
IOSR Journals
 
Module 6: Ensemble Algorithms
Module 6:  Ensemble AlgorithmsModule 6:  Ensemble Algorithms
Module 6: Ensemble Algorithms
Sara Hooker
 
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
Parth Khare
 
Module 7: Unsupervised Learning
Module 7:  Unsupervised LearningModule 7:  Unsupervised Learning
Module 7: Unsupervised Learning
Sara Hooker
 
Module 4: Model Selection and Evaluation
Module 4: Model Selection and EvaluationModule 4: Model Selection and Evaluation
Module 4: Model Selection and Evaluation
Sara Hooker
 
2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts
Krish_ver2
 
Introduction to Machine Learning Aristotelis Tsirigos
Introduction to Machine Learning Aristotelis Tsirigos Introduction to Machine Learning Aristotelis Tsirigos
Introduction to Machine Learning Aristotelis Tsirigos butest
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET Journal
 
Decision tree
Decision treeDecision tree
Decision tree
ShraddhaPandey45
 

What's hot (19)

M08 BiasVarianceTradeoff
M08 BiasVarianceTradeoffM08 BiasVarianceTradeoff
M08 BiasVarianceTradeoff
 
Major presentation
Major presentationMajor presentation
Major presentation
 
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
 
Module 5: Decision Trees
Module 5: Decision TreesModule 5: Decision Trees
Module 5: Decision Trees
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Module 2: Machine Learning Deep Dive
Module 2:  Machine Learning Deep DiveModule 2:  Machine Learning Deep Dive
Module 2: Machine Learning Deep Dive
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
 
Machine learning - session 3
Machine learning - session 3Machine learning - session 3
Machine learning - session 3
 
Module 3: Linear Regression
Module 3:  Linear RegressionModule 3:  Linear Regression
Module 3: Linear Regression
 
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic ConceptsData Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
 
Q01741118123
Q01741118123Q01741118123
Q01741118123
 
Module 6: Ensemble Algorithms
Module 6:  Ensemble AlgorithmsModule 6:  Ensemble Algorithms
Module 6: Ensemble Algorithms
 
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
 
Module 7: Unsupervised Learning
Module 7:  Unsupervised LearningModule 7:  Unsupervised Learning
Module 7: Unsupervised Learning
 
Module 4: Model Selection and Evaluation
Module 4: Model Selection and EvaluationModule 4: Model Selection and Evaluation
Module 4: Model Selection and Evaluation
 
2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts
 
Introduction to Machine Learning Aristotelis Tsirigos
Introduction to Machine Learning Aristotelis Tsirigos Introduction to Machine Learning Aristotelis Tsirigos
Introduction to Machine Learning Aristotelis Tsirigos
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
 
Decision tree
Decision treeDecision tree
Decision tree
 

Similar to Machine learning Method and techniques

Industrial training ppt
Industrial training pptIndustrial training ppt
Industrial training ppt
HRJEETSINGH
 
Supervised learning techniques and applications
Supervised learning techniques and applicationsSupervised learning techniques and applications
Supervised learning techniques and applications
Benjaminlapid1
 
Machine Learning - Deep Learning
Machine Learning - Deep LearningMachine Learning - Deep Learning
Machine Learning - Deep Learning
Adetimehin Oluwasegun Matthew
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
Adetimehin Oluwasegun Matthew
 
Machine Learning_PPT.pptx
Machine Learning_PPT.pptxMachine Learning_PPT.pptx
Machine Learning_PPT.pptx
RajeshBabu833061
 
An Introduction to Machine Learning
An Introduction to Machine LearningAn Introduction to Machine Learning
An Introduction to Machine Learning
Vedaj Padman
 
machinecanthink-160226155704.pdf
machinecanthink-160226155704.pdfmachinecanthink-160226155704.pdf
machinecanthink-160226155704.pdf
PranavPatil822557
 
Machine Can Think
Machine Can ThinkMachine Can Think
Machine Can Think
Rahul Jaiman
 
Machine Learning by Rj
Machine Learning by RjMachine Learning by Rj
Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)butest
 
Machine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptxMachine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptx
iaeronlineexm
 
Case Study 2 SCADA WormProtecting the nation’s critical infra.docx
Case Study 2 SCADA WormProtecting the nation’s critical infra.docxCase Study 2 SCADA WormProtecting the nation’s critical infra.docx
Case Study 2 SCADA WormProtecting the nation’s critical infra.docx
wendolynhalbert
 
slides
slidesslides
slidesbutest
 
slides
slidesslides
slidesbutest
 
Machine Learning.pptx
Machine Learning.pptxMachine Learning.pptx
Machine Learning.pptx
NitinSharma134320
 
INTRODUCTION TO MACHINE LEARNING.pptx
INTRODUCTION TO MACHINE LEARNING.pptxINTRODUCTION TO MACHINE LEARNING.pptx
INTRODUCTION TO MACHINE LEARNING.pptx
AbhigyanMishra17
 
Machine learning presentation (razi)
Machine learning presentation (razi)Machine learning presentation (razi)
Machine learning presentation (razi)
Rizwan Shaukat
 
Machine Learning Interview Questions and Answers
Machine Learning Interview Questions and AnswersMachine Learning Interview Questions and Answers
Machine Learning Interview Questions and Answers
Satyam Jaiswal
 
Chapter 05 Machine Learning.pptx
Chapter 05 Machine Learning.pptxChapter 05 Machine Learning.pptx
Chapter 05 Machine Learning.pptx
ssuser957b41
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Panimalar Engineering College
 

Similar to Machine learning Method and techniques (20)

Industrial training ppt
Industrial training pptIndustrial training ppt
Industrial training ppt
 
Supervised learning techniques and applications
Supervised learning techniques and applicationsSupervised learning techniques and applications
Supervised learning techniques and applications
 
Machine Learning - Deep Learning
Machine Learning - Deep LearningMachine Learning - Deep Learning
Machine Learning - Deep Learning
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Machine Learning_PPT.pptx
Machine Learning_PPT.pptxMachine Learning_PPT.pptx
Machine Learning_PPT.pptx
 
An Introduction to Machine Learning
An Introduction to Machine LearningAn Introduction to Machine Learning
An Introduction to Machine Learning
 
machinecanthink-160226155704.pdf
machinecanthink-160226155704.pdfmachinecanthink-160226155704.pdf
machinecanthink-160226155704.pdf
 
Machine Can Think
Machine Can ThinkMachine Can Think
Machine Can Think
 
Machine Learning by Rj
Machine Learning by RjMachine Learning by Rj
Machine Learning by Rj
 
Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)
 
Machine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptxMachine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptx
 
Case Study 2 SCADA WormProtecting the nation’s critical infra.docx
Case Study 2 SCADA WormProtecting the nation’s critical infra.docxCase Study 2 SCADA WormProtecting the nation’s critical infra.docx
Case Study 2 SCADA WormProtecting the nation’s critical infra.docx
 
slides
slidesslides
slides
 
slides
slidesslides
slides
 
Machine Learning.pptx
Machine Learning.pptxMachine Learning.pptx
Machine Learning.pptx
 
INTRODUCTION TO MACHINE LEARNING.pptx
INTRODUCTION TO MACHINE LEARNING.pptxINTRODUCTION TO MACHINE LEARNING.pptx
INTRODUCTION TO MACHINE LEARNING.pptx
 
Machine learning presentation (razi)
Machine learning presentation (razi)Machine learning presentation (razi)
Machine learning presentation (razi)
 
Machine Learning Interview Questions and Answers
Machine Learning Interview Questions and AnswersMachine Learning Interview Questions and Answers
Machine Learning Interview Questions and Answers
 
Chapter 05 Machine Learning.pptx
Chapter 05 Machine Learning.pptxChapter 05 Machine Learning.pptx
Chapter 05 Machine Learning.pptx
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 

Recently uploaded

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
Abida Shariff
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 

Recently uploaded (20)

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 

Machine learning Method and techniques

  • 2. Objectives  Let us look at some of the objectives under this Techniques of Machine Learning tutorial.  Explain unsupervised learning with examples  Describe semi-supervised learning and reinforcement learning  Discuss supervised learning with examples  Define some important models and techniques in Machine Learning
  • 3. How do Machines learn?  There are various methods to do that. Which method to follow completely depends on the problem statement. Depending on the dataset, and our problem, there are two different ways to go deeper. One is supervised learning and the other is unsupervised learning. The following chart explains the further classification of machine learning methods. We will discuss them one by one.
  • 4. Machine Learning Methods Supervised Learning Classification Regression Unsupervise d Learning Clustering Association Other ML Learning Dimensionality Reduction Ensemble Methods Neural Nets and Deep Learning Transfer Learning Natural Language Processing Word Embeddings
  • 5. What is Supervised Learning?  Supervised Learning is a type of Machine Learning used to learn models from labeled training data. It allows us to predict the output for future or unseen data.
  • 6. Understanding the Algorithm of Supervised Learning The image above explains the relationship between input and output data of Supervised Learning.
  • 7. Supervised Learning Flow  Let’s look at the steps of Supervised Learning flow:  Data Preparation  Training Step  Evaluation or Test Step  Production Deployment
  • 8. Testing the Algorithm  Given below are the steps for testing the algorithm of Supervised Learning.  Once the algorithm is trained, test it with test data (a set of data instances that do not appear in the training set).  A well-trained algorithm can predict well for new test data.  If the learning is poor, we have an underfit situation. The algorithm will not work well on test data. Retraining may be needed to find a better fit.  If learning on training data is too intensive, it may lead to overfitting – a situation where the algorithm is not able to handle new testing data that it has not seen before. The technique to keep data generic is called regularization.
  • 9. Examples of Supervised Learning  Example 1: Voice Assistants like Apple Siri, Amazon Alexa, Microsoft Cortana, and Google Assistant are trained to understand human speech and intent. Based on human interactions, these chatbots take appropriate action.  Example 2: Gmail filters a new email into Inbox (normal) or Junk folder (Spam) based on past information about what you consider spam.  Example 3: The predictions made by weather apps at a given time are based on some prior knowledge and analysis of how the weather has been over a period of time for a particular place.
  • 10. Types of Supervised Learning Given below are 2 types of Supervised Learning.  Classification  Regression
  • 11. Classification  When the output variable is categorical like two or more classes we make the use of classification. Here the answer is set like true/false and yes or no. The output comes based on the category like black or white, male or female and fit or unfit.  Classification is a problem that is used to predict which class a data point is part of which is usually a discrete value. From the example I gave above, predicting whether a person is likely to default on a loan or not is an example of a classification problem since the classes we want to predict are discrete: “likely to pay a loan” and “not likely to pay a loan”.
  • 12. Classification: predicting a class/label  Classification is used to predict a discrete class or label(Y). Classification basically involves assigning new input variables (X) to the class to which they most likely belong in based on a classification model that was built from the training data that was already labeled. Labeled data is used to train a classifier so that the algorithm performs well on data that does not have a label(not yet labeled). Repeating this process of training a classifier on already labeled data is known as “learning”.
  • 13.  Some of the questions that a classification model helps to answer are:  Is this a picture of a cat or a dog?  Is this email Spam or not?  Is it going to rain or not?  Is this borrower going to repay their loan?  Is this post negative or positive?  What is the genre of this song/movie?  Which type of gene is this?
  • 14.  Classification is again divided into three other categories or problems which are: Binary classification, Multi- class/Multinomial classification and Multi-label classification.
  • 15. Binary classification  This is a task of classifying the elements/input variables of a given set into two groups i.e predicting which of the two groups each variable belongs to. Problems like predicting whether a picture is of a cat or dog or predicting whether an email is Spam or not are Binary classification problems.
  • 16. Multi-class/Multinomial classification  This is the task of classifying elements/ input variables into one of three or more classes/groups. Contrary to binary classification where elements are classified into one of two classes. Some use cases of this type of classification can be: classifying news into different categories(sports/entertainment/political), sentiment analysis;classifying text into either positive negative or neutral, segmenting customers for marketing purposes etc.
  • 17.  Note that sentiment analysis can either be a binary classification or a multi-class classification depending on the number of classes you want to be used to classify text elements. In binary, one would predict whether a statement is “negative” or “positive”, while in multi-class, one would have other classes to predict such as sadness, happiness, fear/surprise and anger/disgust.
  • 18. Multi-label classification  This classification problem can be easily confused with the multi-class classification but they have a distinct difference. Multi-label is a generalization of multi-class which is a single-label problem of categorizing instances into precisely one of more than two classes. In this case, we have more than one discrete classes.
  • 19. Classification Algorithms There are various classification algorithms that are used to make predictions such as:  Neural Networks — Has various use cases. An example is in Computer Vision which is done through convolutional neural networks(CNN). You can read more on how Google classifies people and places using Computer Vision together with other use cases on a post on Introduction to Computer Vision that my boyfriend wrote.  K-NN — K-Nearest Neighbors is often used in search applications where you are looking for “similar” items. One of the biggest use cases of K-NN search is in the development of Recommender Systems.  Decision Trees — Decision trees are used in both regression and classification problems. A decision tree can be used to visually and explicitly represent decisions and decision making. They can be used to assess the characteristics of a client that leads to the purchase of a new product in a direct marketing campaign.  Random Forests — Random Forest algorithms can also be used in both regression and classification problems. It builds multiple decision trees and merges them together to get a more accurate and stable prediction. It can be used in a number of circumstances including image classification, recommendation engines, feature selection, etc.  Support Vector Machines(SVM) — This is a fundamental data science algorithm which can be used for both regression or classification problems. However, it is mostly used in classification problems. It has a plethora of use cases such as face detection, handwriting recognition and classification of images just to mention a few.  Naive Bayes — This is a simple and easy to implement algorithm. A classical use case for Naive Bayes is document classification where it determines whether a given text document corresponds to one or more categories. It can be used in classifying whether an email is Spam or not Spam or to classify a news article about technology, politics or sports. I’ve also previously done sentiment analysis using Naive Bayes. You can find the notes and code here.
  • 20. Regression  The relationship between two or more variables associated with each other for changing the value of another variable. For example, when you ask for a salary it depends on your working experience. The height weight chart according to age can be an example of regression machine learning.  Regression is a problem that is used to predict continuous quantity output. A continuous output variable is a real-value, such as an integer or floating point value. For example, where classification has been used to determine whether or not it will rain tomorrow, a regression algorithm will be used to predict the amount of rainfall.
  • 21. Types of Regression  Simple Linear Regression  Polynomial Regression  Support Vector Regression  Decision Tree Regression  Random Forest Regression
  • 22. Simple Linear Regression  This is one of the most common and interesting type of Regression technique. Here we predict a target variable Y based on the input variable X. A linear relationship should exist between target variable and predictor and so comes the name Linear Regression.  Consider predicting the salary of an employee based on his/her age. We can easily identify that there seems to be a correlation between employee’s age and salary (more the age more is the salary). The hypothesis of linear regression is  Y=a+bX  Y represents salary, X is employee’s age and a and b are the coefficients of equation. So in order to predict Y (salary) given X (age), we need to know the values of a and b (the model’s coefficients).
• 23. Polynomial Regression  In polynomial regression, we transform the original features into polynomial features of a given degree and then apply Linear Regression to them. For example, the linear model Y = a + bX above is transformed into something like Y = a + b₁X + b₂X² (here, a polynomial of degree 2).
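A minimal sketch of the same idea with scikit-learn (an assumption; the slide names no library), where PolynomialFeatures expands X before the linear fit and the quadratic data is invented:

```python
# A minimal sketch of polynomial regression: expand X into polynomial
# features, then apply ordinary linear regression. Data is invented.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

X = np.linspace(0, 10, 30).reshape(-1, 1)
y = 2 + 0.5 * X.ravel() + 0.3 * X.ravel() ** 2   # quadratic ground truth

model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print(model.predict([[5.0]]))  # close to 2 + 2.5 + 7.5 = 12.0
```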
• 24. Support Vector Regression  In SVR, we identify a hyperplane with a margin around it such that the maximum number of data points fall within that margin; the width of the margin is controlled by a tolerance parameter (ε). SVR is closely related to the SVM classification algorithm (a short sketch follows the next slide).
• 25. Decision Tree Regression  Decision trees can be used for classification as well as regression. In a decision tree, at each level we need to identify the splitting attribute. In the case of regression, an ID3-style algorithm can choose the splitting attribute by maximizing the reduction in standard deviation of the target (in classification, information gain is used).
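The following sketch exercises both of the last two slides, fitting an SVR and a decision tree regressor to the same invented noisy sine data (scikit-learn assumed):

```python
# A minimal sketch comparing Support Vector Regression and Decision Tree
# Regression on the same invented data.
import numpy as np
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 80)

svr = SVR(kernel="rbf", epsilon=0.1).fit(X, y)       # epsilon sets the margin width
tree = DecisionTreeRegressor(max_depth=4).fit(X, y)  # splits chosen by variance reduction

print("SVR  prediction at x=2.5:", svr.predict([[2.5]]))
print("Tree prediction at x=2.5:", tree.predict([[2.5]]))
```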
• 26. Random Forest Regression  Random forest is an ensemble approach in which we combine the predictions of several decision regression trees.  Select K random data points from the training set and build a decision tree regressor on them.  Choose n, the number of decision tree regressors to be created, and repeat the previous step to create several regression trees.  Within each decision tree, each leaf node is assigned the average of the training targets that reach it.  To predict the output for a new data point, take the average of the predictions of all the decision trees.  Random Forest prevents the overfitting that is common in single decision trees by building each tree on a random subset of the data and features.
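A minimal sketch of the procedure above using scikit-learn's RandomForestRegressor on invented data, where n_estimators plays the role of n:

```python
# A minimal sketch of random forest regression with scikit-learn.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Each tree is trained on a bootstrap sample of the training data.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict(X[:3]))   # average of all 100 trees' predictions
```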
• 27. Classification Supervised Learning Let us look at the characteristics of classification in Supervised Learning.  Answers “What class?”  Applied when the output has finite and discrete values. Example: social media sentiment analysis has three potential outcomes: positive, negative, or neutral. Example: given the age and salary of consumers, predict whether they will be interested in purchasing a house. You can perform this in your lab environment with the dataset available in the LMS.
• 28. Regression Supervised Learning Given below are some elements of regression in Supervised Learning.  Answers “How much?”  Applied when the output is a continuous number.  A simple regression model: y = wx + b. Example: the relationship between environmental temperature (y) and humidity levels (x). Example: given the details of the area where a house is located, predict its price. You can perform this in your lab environment with the dataset available in the LMS.
  • 29. Unsupervised Learning: Case Study  Ever wondered how NASA discovers a new heavenly body and identifies that it is different from a previously known astronomical object? It has no knowledge of these new bodies but classifies them into proper categories.  NASA uses unsupervised learning to create clusters of heavenly bodies, with each cluster containing objects of a similar nature. Unsupervised Learning is a subset of Machine Learning used to extract inferences from datasets that consist of input data without labeled responses.
• 30. Types of Unsupervised Learning The 3 types of Unsupervised Learning are:  Clustering  Visualization Algorithms  Anomaly Detection The most common unsupervised learning method is cluster analysis. It is used to find clusters in the data such that the data within each cluster are as similar as possible.
  • 31. Clustering Example: An online news portal segments articles into various categories like Business, Technology, Sports, etc.
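A minimal sketch of cluster analysis with k-means, assuming scikit-learn and synthetic blob data standing in for numerical features extracted from articles:

```python
# A minimal sketch of k-means clustering on invented data.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])       # cluster assignment for each point
print(kmeans.cluster_centers_)   # one centroid per cluster
```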
• 32. Visualization Algorithms  Visualization algorithms are unsupervised learning algorithms that accept unlabeled data and display it in an intuitive 2D or 3D format. The data is separated into somewhat clear clusters to aid understanding.  In a typical visualization of image data, the animals end up well separated from the vehicles; horses appear close to deer but far from birds, and so on.
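One common visualization algorithm is t-SNE (the slide does not name a specific algorithm, so this choice is an assumption); a minimal scikit-learn sketch on the digits dataset:

```python
# A minimal sketch of a visualization algorithm: t-SNE squeezes the 64-dim
# digits data into 2D so that similar items land near each other.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
X_2d = TSNE(n_components=2, random_state=0).fit_transform(X)
print(X_2d.shape)   # (1797, 2) -- ready to scatter-plot, colored by y
```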
• 33. Anomaly Detection  This kind of algorithm flags data points that deviate strongly from the rest of the data, without needing labeled examples of anomalies. It can detect suspicious credit card transactions or differentiate an unusual individual, such as a criminal, from a set of people.
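A minimal sketch of anomaly detection with an Isolation Forest (one possible choice, assuming scikit-learn), where two extreme points stand in for suspicious transactions:

```python
# A minimal sketch of anomaly detection with an Isolation Forest; think of
# each row as a credit card transaction's numeric features (invented data).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))    # typical transactions
outliers = np.array([[8.0, 8.0], [-9.0, 7.0]])            # suspicious ones
X = np.vstack([normal, outliers])

detector = IsolationForest(contamination=0.01, random_state=0).fit(X)
print(detector.predict(outliers))   # -1 marks an anomaly, 1 marks normal
```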
• 34. What is Semi-Supervised Learning?  It is a hybrid approach (a combination of Supervised and Unsupervised Learning) that works with some labeled and some unlabeled data.
  • 35. Example of Semi-Supervised Learning  Google Photos automatically detects the same person in multiple photos from a vacation trip (clustering – unsupervised). One has to just name the person once (supervised), and the name tag gets attached to that person in all the photos.
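A minimal sketch of the semi-supervised idea with scikit-learn's LabelSpreading (one possible choice), where only a handful of points keep their labels and -1 marks the rest as unlabeled; the data is invented:

```python
# A minimal sketch of semi-supervised learning: propagate a few known labels
# across unlabeled data. Toy data, and the kernel choice is an assumption.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.semi_supervised import LabelSpreading

X, y = make_blobs(n_samples=200, centers=2, random_state=0)
y_partial = np.full_like(y, -1)   # -1 means "unlabeled"
y_partial[:5] = y[:5]             # keep labels for only 5 points

model = LabelSpreading(kernel="knn", n_neighbors=10).fit(X, y_partial)
print((model.transduction_ == y).mean())   # fraction of correctly recovered labels
```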
  • 36. What is Reinforcement Learning?  Reinforcement Learning is a type of Machine Learning that allows the learning system to observe the environment and learn the ideal behavior based on trying to maximize some notion of cumulative reward.
  • 37. Features of Reinforcement Learning Some of the features of Reinforcement Learning are mentioned below.  The learning system (agent) observes the environment, selects and takes certain actions, and gets rewards in return (or penalties in certain cases).  The agent learns the strategy or policy (choice of actions) that maximizes its rewards over time.
• 39. Example of Reinforcement Learning  In a manufacturing unit, a robot uses deep reinforcement learning to pick a device from one box and put it in a container. The robot learns this by means of a rewards-based learning system, which incentivizes it for the right action.
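A minimal tabular Q-learning sketch of the agent/reward loop described above; the 5-cell corridor environment and all hyperparameters are invented for illustration:

```python
# A minimal tabular Q-learning sketch: an agent in a 5-cell corridor learns
# to walk right to reach a reward in the last cell.
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != n_states - 1:        # episode ends at the goal cell
        # epsilon-greedy action selection: explore sometimes, exploit otherwise
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: move Q[s, a] toward reward + discounted future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print("Learned policy:", ["left" if q.argmax() == 0 else "right" for q in Q])
```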
  • 40. Other Machine Learning  Dimensionality Reduction  Ensemble Methods  Neural Nets and Deep Learning  Transfer Learning  Natural Language Processing  Word Embeddings
• 41. Dimensionality Reduction  Dimensionality reduction can be thought of as compressing a file: it removes information that is not relevant while reducing the complexity of the data and keeping the meaningful part. In image compression, for example, we reduce the dimensionality of the space in which the image lives without destroying too much of the meaningful content in the image.
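A minimal sketch of dimensionality reduction with PCA (one standard choice, assuming scikit-learn), compressing 64-pixel digit images down to 2 components:

```python
# A minimal sketch of dimensionality reduction with PCA: project 64-feature
# digit images down to 2 components while keeping most of the variance.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)       # 1797 images, 64 features each

pca = PCA(n_components=2).fit(X)
X_reduced = pca.transform(X)
print(X_reduced.shape)                    # (1797, 2)
print(pca.explained_variance_ratio_)      # variance kept by each component
```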
  • 42. Ensemble Methods  Imagine you’ve decided to build a bicycle because you are not feeling happy with the options available in stores and online. You might begin by finding the best of each part you need. Once you assemble all these great parts, the resulting bike will outshine all the other options.
• 43.  Ensemble methods use this same idea of combining several predictive models (supervised ML) to get higher-quality predictions than any of the models could provide on its own. For example, the Random Forest algorithm is an ensemble method that combines many Decision Trees trained on different samples of the same data set. As a result, the quality of the predictions of a Random Forest is higher than the quality of the predictions estimated with a single Decision Tree.
  • 44.  Think of ensemble methods as a way to reduce the variance and bias of a single machine learning model. That’s important because any given model may be accurate under certain conditions but inaccurate under other conditions. With another model, the relative accuracy might be reversed. By combining the two models, the quality of the predictions is balanced out.  The great majority of top winners of Kaggle competitions use ensemble methods of some kind. The most popular ensemble algorithms are Random Forest, XGBoost and LightGBM.
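A minimal sketch of a voting ensemble with scikit-learn, combining three different models so their individual errors partially cancel; the data and model choices are invented:

```python
# A minimal sketch of an ensemble: combine three classifiers with a majority vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

ensemble = VotingClassifier(estimators=[
    ("lr", LogisticRegression(max_iter=1000)),
    ("tree", DecisionTreeClassifier(max_depth=5)),
    ("forest", RandomForestClassifier(n_estimators=50)),
])
ensemble.fit(X, y)
print(ensemble.score(X, y))   # training accuracy of the combined model
```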
• 45. Neural Nets and Deep Learning  In contrast to linear and logistic regression, which are considered linear models, the objective of neural networks is to capture non-linear patterns in data by adding layers of parameters to the model. In the figure below, the simple neural net has three inputs, a single hidden layer with five units, and an output layer.
• 46. [Figure: a simple neural net with three inputs, one hidden layer of five units, and one output]
• 47.  In fact, the structure of neural networks is flexible enough to reproduce our well-known linear and logistic regression models. The term Deep Learning refers to a neural net with many hidden layers (see the next figure) and encapsulates a wide variety of architectures.
  • 48.  It’s especially difficult to keep up with developments in deep learning, in part because the research and industry communities have doubled down on their deep learning efforts, spawning whole new methodologies every day.
• 49. [Figure: a deep neural net with many hidden layers]
• 50.  For the best performance, deep learning techniques require a lot of data, and a lot of compute power, since the method self-tunes many parameters within huge architectures. It quickly becomes clear why deep learning practitioners need very powerful computers enhanced with GPUs (graphics processing units).  Deep learning techniques have been especially successful in the areas of vision (image classification), text, audio, and video. The most common software packages for deep learning are TensorFlow and PyTorch.
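A minimal PyTorch sketch of the small net described on the Neural Nets slide (three inputs, five hidden units, one output); the shapes and data are invented:

```python
# A minimal PyTorch sketch of a small feed-forward neural net.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(3, 5),   # input layer -> hidden layer (5 units)
    nn.ReLU(),         # the non-linearity is what captures non-linear patterns
    nn.Linear(5, 1),   # hidden layer -> single output
)

x = torch.randn(8, 3)                                # batch of 8 examples, 3 features each
loss = ((model(x) - torch.zeros(8, 1)) ** 2).mean()  # toy squared-error loss
loss.backward()                                      # gradients ready for an optimizer step
print(loss.item())
```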
• 51. Transfer Learning  Let’s pretend that you’re a data scientist working in the retail industry. You’ve spent months training a high-quality model to classify images as shirts, t-shirts, and polos. Your new task is to build a similar model to classify images of pants as jeans, cargo, casual, and dress pants. Can you transfer the knowledge built into the first model and apply it to the second model? Yes, you can, using Transfer Learning.
  • 52.  Transfer Learning refers to re-using part of a previously trained neural net and adapting it to a new but similar task. Specifically, once you train a neural net using data for a task, you can transfer a fraction of the trained layers and combine them with a few new layers that you can train using the data of the new task. By adding a few layers, the new neural net can learn and adapt quickly to the new task.
  • 53.  The main advantage of transfer learning is that you need less data to train the neural net, which is particularly important because training for deep learning algorithms is expensive in terms of both time and money (computational resources) — and of course it’s often very difficult to find enough labeled data for the training.
• 54.  Let’s return to our example and assume that for the shirt model you use a neural net with 20 hidden layers. After running a few experiments, you realize that you can transfer 18 of the shirt model’s layers and combine them with one new layer of parameters to train on the images of pants. The pants model would therefore have 19 hidden layers. The inputs and outputs of the two tasks are different, but the re-usable layers may be summarizing information that is relevant to both, for example aspects of fabric.
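A hedged PyTorch sketch of this recipe: freeze the transferred layers of a stand-in shirt_model and train only a new head for the pants classes. shirt_model here is a made-up placeholder, not a real pre-trained checkpoint:

```python
# A sketch of transfer learning by layer freezing; shirt_model is hypothetical.
import torch.nn as nn

# Pretend shirt_model is a pre-trained network; freeze its parameters.
shirt_model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32), nn.ReLU())
for param in shirt_model.parameters():
    param.requires_grad = False    # transferred layers stay fixed

# New trainable head for the pants classes (jeans, cargo, casual, dress).
pants_model = nn.Sequential(shirt_model, nn.Linear(32, 4))

trainable = [p for p in pants_model.parameters() if p.requires_grad]
print(f"training only {sum(p.numel() for p in trainable)} parameters")
```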
  • 55.  Transfer learning has become more and more popular and there are now many solid pre-trained models available for common deep learning tasks like image and text classification.
  • 56. Natural Language Processing  A huge percentage of the world’s data and knowledge is in some form of human language. Can you imagine being able to read and comprehend thousands of books, articles and blogs in seconds? Obviously, computers can’t yet fully understand human text but we can train them to do certain tasks. For example, we can train our phones to autocomplete our text messages or to correct misspelled words. We can even teach a machine to have a simple conversation with a human.
• 57.  Natural Language Processing (NLP) is not a machine learning method per se, but rather a widely used technique for preparing text for machine learning. Think of tons of text documents in a variety of formats (Word files, online blogs, and so on). Most of these text documents will be full of typos, missing characters, and other words that need to be filtered out. At the moment, one of the most popular packages for processing text is NLTK (Natural Language ToolKit), created by researchers at the University of Pennsylvania.
• 58.  The simplest way to map text into a numerical representation is to compute the frequency of each word within each text document. Think of a matrix of integers where each row represents a text document and each column represents a word. This matrix representation of the word frequencies is commonly called a Term Frequency Matrix (TFM). From there, we can create another popular matrix representation by reweighting each entry according to how important each word is within the entire corpus of documents: rare words get boosted, while words that appear everywhere get down-weighted. We call this method Term Frequency Inverse Document Frequency (TFIDF), and it typically works better for machine learning tasks.
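A minimal sketch of both representations with scikit-learn (an assumption; the slides' own NLTK would typically handle the cleanup step beforehand); the three tiny documents are invented:

```python
# A minimal sketch of the TFM and TFIDF representations of text documents.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs are pets"]

tfm = CountVectorizer().fit_transform(docs)      # term frequency matrix
tfidf = TfidfVectorizer().fit_transform(docs)    # frequencies reweighted by rarity

print(tfm.toarray())     # rows = documents, columns = words, entries = counts
print(tfidf.toarray())   # common words like "the" get down-weighted
```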
  • 59. Word Embeddings TFM and TFIDF are numerical representations of text documents that only consider frequency and weighted frequencies to represent text documents. By contrast, word embeddings can capture the context of a word in a document. With the word context, embeddings can quantify the similarity between words, which in turn allows us to do arithmetic with words.
• 60. Word2Vec is a method based on neural nets that maps words in a corpus to a numerical vector. We can then use these vectors to find synonyms, perform arithmetic operations with words, or represent text documents (by taking the mean of all the word vectors in a document). For example, let’s assume that we use a sufficiently big corpus of text documents to estimate word embeddings. Let’s also assume that the words king, queen, man, and woman are part of the corpus, and that vector(‘word’) is the numerical vector that represents the word ‘word’. To estimate vector(‘queen’), we can perform the arithmetic operation with vectors: vector(‘king’) + vector(‘woman’) - vector(‘man’) ≈ vector(‘queen’)
• 61. Arithmetic with Word Embeddings (Vectors). Word representations allow us to find similarities between words by computing the cosine similarity between the vector representations of two words, where cosine similarity measures the angle between two vectors. We compute word embeddings using machine learning methods, but that’s often a pre-step to applying a machine learning algorithm on top. For instance, suppose we have access to the tweets of several thousand Twitter users, and also suppose that we know which of these users bought a house. To predict the probability of a new Twitter user buying a house, we can combine Word2Vec with a logistic regression. You can train word embeddings yourself or get a pre-trained (transfer learning) set of word vectors. To download pre-trained word vectors in 157 different languages, take a look at FastText.
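A hedged sketch of this word arithmetic using gensim's downloader with a small pre-trained GloVe model (one possible source of pre-trained vectors; the slide mentions FastText as another). The model downloads on first run:

```python
# A sketch of word-vector arithmetic and cosine similarity with gensim.
import gensim.downloader as api

wv = api.load("glove-wiki-gigaword-50")   # pre-trained 50-dimensional vectors

# vector('king') + vector('woman') - vector('man') ≈ vector('queen')
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# cosine similarity between two word vectors
print(wv.similarity("king", "queen"))
```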