NPTEL Machine Learning Syllabus

NPTEL – Machine Learning
Created By: Madhur Jatiya
Email: Madhurjatiya13@gmail.com
Machine Learning
Syllabus
Week 1: Introduction: Statistical Decision Theory - Regression,
Classification, Bias Variance.
Week 2: Linear Regression, Multivariate Regression, Subset Selection,
Shrinkage Methods, Principal Component Regression, Partial Least
squares.
Week 3: Linear Classification, Logistic Regression, Linear Discriminant
Analysis.
Week 4: Perceptron, Support Vector Machines.
Week 5: Neural Networks Introduction, Early Models, Perceptron
Learning, Backpropagation, Initialization, Training & Validation, Parameter
Estimation - MLE, MAP, Bayesian Estimation.
Week 6: Decision Trees, Regression Trees, Stopping Criterion & Pruning,
Loss Functions, Categorical Attributes, Multiway Splits, Missing Values,
Decision Trees - Instability, Evaluation Measures.
Week 7: Bootstrapping & Cross Validation, Class Evaluation Measures,
ROC curve, MDL, Ensemble Methods - Bagging, Committee Machines and
Stacking, Boosting
Week 8: Gradient Boosting, Random Forests, Multi-class Classification,
Naive Bayes, Bayesian Networks
Week 9: Undirected Graphical Models, HMM, Variable Elimination, Belief
Propagation.
Week 10: Partitional Clustering, Hierarchical Clustering, Birch Algorithm,
CURE Algorithm, Density-based Clustering.
Week 11: Gaussian Mixture Models, Expectation Maximization.
Week 12: Learning Theory, Introduction to Reinforcement Learning,
Optional videos (RL framework, TD learning, Solution Methods,
Applications).

Important Questions
Week 1
• What is machine learning?
• Supervised Learning
• Unsupervised learning
• Steps to implement supervised learning
• Classification, overfitting, bias-variance trade-off
• Common techniques to avoid overfitting
Week 2
• Linear Regression
• Principal Component Analysis (PCA)
• How to reduce dimensionality using PCA
• Selecting the principal components
Week 3
• Linear classification
• Logistic regression
• Support Vector Machines
Week 4
• Binary and multi-class classification contexts
• Comparison of algorithms: SVM, perceptron, logistic regression
• Advantages, disadvantages, and scenarios for using each algorithm
Week 5
• Neural Network and activation functions
• Perceptron algorithm and how it works
• Backpropagation algorithm
• Hyperparameters and their selection

Week 6
• Decision Tree
• Pruning Techniques
• Regression tree
• Importance of stopping criteria
Week 7
• ROC Curve
• Cross Validation and K-fold method
• Ensemble Methods
• Bagging and boosting techniques
Week 8
• Random Forest
• Naive Bayes and Bayesian Networks
Week 12
• Reinforcement learning
• Comparison of Supervised and Reinforcement Learning
• Features and Applications of Reinforcement Learning
• Comparison of Reinforcement Learning and Deep Learning

Week 1
1. What is Machine Learning?
Ans: Machine learning is the process of using algorithms and statistical
models to allow a computer system to improve its performance on a
specific task by learning from data, without being explicitly programmed.
It involves training a model on a dataset and then using that model to
make predictions on new data.
Classification of Machine Learning
• Supervised learning
• Unsupervised learning
• Reinforcement learning
Application of Machine Learning
• Image Recognition: Image recognition is one of the most common
applications of machine learning. It is used to identify objects, persons,
and places in digital images. A popular use case of image recognition
is face detection.

• Speech Recognition: Google's "Search by voice" option is an example of
speech recognition, which is a popular application of machine learning.
• Traffic prediction: When we want to visit a new place, we can use
Google Maps, which suggests the shortest route and predicts the
traffic conditions.
• Self-driving cars: One of the most exciting applications of machine
learning is self-driving cars.
2. Supervised Learning
Ans: Supervised learning is a type of machine learning in which the model
is trained using labelled data, where the outcome or response variable is
known. The goal of supervised learning is to create a model that can
accurately predict the output for new input data.
Types of supervised Machine learning Algorithms

Regression: Regression algorithms are used when the output variable is
continuous and depends on the input variables. They are used for the
prediction of continuous quantities, for example in weather forecasting
and market trend analysis.
• Linear Regression
• Non-Linear Regression
• Regression Trees
• Bayesian Linear Regression
• Polynomial Regression
Classification: Classification algorithms are used when the output
variable is categorical, which means it takes one of two or more classes,
such as Yes-No, Male-Female, True-False, etc.
• Random Forest
• Decision Trees
• Logistic Regression
• Support vector Machines
Advantages of Supervised learning
• With the help of supervised learning, the model can predict the output
on the basis of prior experiences.
• In supervised learning, we can have an exact idea about the classes of
objects.
• Supervised learning model helps us to solve various real-world
problems such as fraud detection, spam filtering, etc.
Disadvantages of Supervised learning
• Supervised learning models can struggle with very complex tasks
unless a large amount of labelled data is available.
• Supervised learning may predict poorly if the test data is distributed
differently from the training dataset.
• Training can require a lot of computation time.

3. Unsupervised Learning
Ans: Unsupervised learning is a type of machine learning in which the
model is trained using unlabelled data, where the outcome or response
variable is not known. The goal of unsupervised learning is to find patterns
or structure in the data.
Types of Unsupervised Learning Algorithm
Clustering: Clustering is a method of grouping objects into clusters
such that the objects in one group are highly similar to each other and
have little or no similarity with the objects of another group.
Association: Association rule learning is an unsupervised learning method
that is used for finding relationships between variables in large
databases.

4. Steps to implement Supervised Learning
Ans:
• Collect labelled data: Gather data with labelled examples of the input-
output relationship.
• Pre-process the data: Clean, format, and prepare the data for training
the model.
• Feature selection: Choose the most important features from the data
to reduce dimensionality.
• Model selection: Choose an appropriate model that can learn the
input-output relationship from the data.
• Hyperparameter tuning: Adjust the model's hyperparameters to
optimize performance.
• Train the model: Feed the data into the model and adjust its
parameters to minimize the difference between predicted and actual
output.
• Evaluate the model's performance: Use various metrics to measure
the model's ability to generalize to new data.
5. Overfitting, Underfitting, bias-variance trade-off.
Ans:
Overfitting
• Overfitting occurs when our machine learning model tries to fit every
data point in the given dataset, including points that do not reflect
the underlying trend.
• Because of this, the model starts capturing noise and inaccurate values
present in the dataset, which reduces its accuracy on new data.
• The overfitted model has low bias and high variance.
• The chance of overfitting increases the longer we train the model on
the same data.

Underfitting
• Underfitting occurs when our machine learning model is not able to
capture the underlying trend of the data.
• Underfitting can happen, for example, when training is stopped at too
early a stage in order to avoid overfitting, so that the model does not
learn enough from the training data.
• As a result, it may fail to find the best fit of the dominant trend in the
data.
• In the case of underfitting, the model is not able to learn enough from
the training data, and hence it reduces the accuracy and produces
unreliable predictions.
• An underfitted model has high bias and low variance.

Bias-Variance Trade Off
Reducible errors: These errors can be reduced to improve the model
accuracy. Such errors can further be classified into bias and variance.
Irreducible errors: These errors will always be present in the model,
regardless of which algorithm is used.
Bias: The difference between the model's average prediction and the
actual (expected) value is known as the bias error, or error due to bias.
Variance: The variance of the model is the variability of its prediction
for a given data point when it is trained on different training sets.

Bias-Variance Trade-Off: While building the machine learning model, it
is really important to take care of bias and variance in order to avoid
overfitting and underfitting in the model. If the model is very simple with
fewer parameters, it may have low variance and high bias. Whereas, if the
model has a large number of parameters, it will have high variance and
low bias. So, it is required to make a balance between bias and variance
errors, and this balance between the bias error and variance error is known
as the Bias-Variance trade-off.
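For squared-error loss, the split into reducible and irreducible error can be written out explicitly. The following is a sketch of the standard decomposition, assuming the data is generated as y = f(x) + ε with zero-mean noise of variance σ². The symbols f, \hat{f} and σ² are notation introduced here for illustration and do not appear elsewhere in these notes.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The first two terms are the reducible errors; the last term is the irreducible error.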
6. Common techniques to avoid overfitting
Ans:
• Early stopping: Early stopping halts the training phase before the
machine learning model starts to learn the noise in the data.
• Regularization: Regularization is a collection of training/optimization
techniques that seek to reduce overfitting. These methods try to
eliminate those factors that do not impact the prediction outcomes by
grading features based on importance.
• Cross-validation: Cross-validation is a technique used to estimate the
generalization error of a model. It involves splitting the data into
multiple subsets, where each subset is used in turn for testing while
the remaining subsets are used for training the model.

• Data augmentation: Data augmentation is a technique used to
increase the size of the training set by generating new, synthetic data
from the existing data.
• Dropout: Dropout is a regularization technique that randomly drops
out some of the nodes in a neural network during training.
• Ensemble learning: Ensemble learning is a technique used to combine
multiple models to improve performance and reduce overfitting.
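As an illustration of one technique from the list above, early stopping can be sketched as a rule applied to the validation loss recorded after each epoch. This is a minimal sketch, not a definitive implementation: the function name `early_stopping_epoch`, the `patience` parameter and the loss values are all assumptions made for the example.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights would be kept.

    Training is stopped once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation loss keeps rising: stop training here
    return best_epoch


# Hypothetical validation losses: they fall, then rise as the model overfits.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
best = early_stopping_epoch(losses, patience=2)
print("keep the weights from epoch", best)
```

The model from the best epoch is kept, so the later epochs in which the validation loss rises do not affect the final model.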

Week 2
1. Linear Regression
• Ans: Linear regression is one of the easiest and most popular Machine
Learning algorithms.
• It is a statistical method that is used for predictive analysis.
• Linear regression makes predictions for continuous/real or numeric
variables such as sales, salary, age, product price, etc.
• The linear regression algorithm models a linear relationship between a
dependent variable (y) and one or more independent variables (x), hence
the name linear regression.
• Because the relationship is linear, the model describes how the value of
the dependent variable changes with the value of the independent
variable.
Equation
y = a0 + a1x + ε
• y = Dependent variable (target variable)
• x = Independent variable (predictor variable)
• a0 = Intercept of the line (gives an additional degree of freedom)
• a1 = Linear regression coefficient (scale factor applied to each input value)
• ε = Random error
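The coefficients a0 and a1 in the equation above can be estimated from data by ordinary least squares. The sketch below assumes a single independent variable; the function name and the sample data are hypothetical and chosen only for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares estimates of the intercept a0 and slope a1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a0 = mean_y - a1 * mean_x
    return a0, a1


# Hypothetical data lying roughly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
a0, a1 = fit_simple_linear_regression(xs, ys)
print("a0 =", a0, "a1 =", a1)
print("prediction for x = 6:", a0 + a1 * 6.0)
```

The residuals left over after the fit play the role of ε.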

Types of Linear Regression
Linear regression can be further divided into two types of the algorithm:
Simple Linear Regression: If a single independent variable is used to
predict the value of a numerical dependent variable, then such a Linear
Regression algorithm is called Simple Linear Regression.
Multiple Linear Regression: If more than one independent variable is
used to predict the value of a numerical dependent variable, then such a
Linear Regression algorithm is called Multiple Linear Regression.
2. Principal Component Analysis (PCA)
Ans:
• Principal Component Analysis is an unsupervised learning algorithm that
is used for the dimensionality reduction in machine learning.
• It is a statistical process that converts the observations of correlated
features into a set of linearly uncorrelated features with the help of
orthogonal transformation. These new transformed features are called
the Principal Components.
• It is one of the popular tools that is used for exploratory data analysis
and predictive modelling.
• PCA generally tries to find a lower-dimensional surface onto which the
high-dimensional data can be projected.
• Some real-world applications of PCA are image processing, movie
recommendation systems, and optimizing the power allocation in various
communication channels.
3. How to reduce dimensionality using PCA
Ans:
1. Import necessary libraries: Import the libraries needed to load and
process the dataset.
2. Load the dataset: After importing all the necessary libraries, we need
to load the dataset.
3. Standardize the features: Before applying PCA or any other Machine
Learning technique it is always considered good practice to standardize
the data.

4. Applying Principal Component Analysis: We will apply PCA on the
scaled dataset.
5. Check the correlation between features after PCA: the principal
components should be uncorrelated with one another.
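Steps 3 and 4 above can be sketched without any external library. The example below is an illustration under stated assumptions: the data is a small hypothetical two-feature table, all helper names are invented for the example, and power iteration is used in place of a full eigen-decomposition to keep the code short. It recovers only the first principal component.

```python
import math
import random


def standardize(rows):
    """Scale every column to zero mean and unit (sample) variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(d)]
    # Assumes no column is constant (a zero standard deviation would divide by zero).
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance_matrix(rows):
    """Sample covariance matrix of data whose columns already have zero mean."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_principal_component(cov, iterations=500, seed=0):
    """Power iteration: approximates the eigenvector with the largest eigenvalue."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in cov]  # random positive start vector
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(rows, component):
    """Reduce each row to a single score along the given component."""
    return [sum(x * c for x, c in zip(r, component)) for r in rows]


# Hypothetical dataset with two strongly correlated features.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
scaled = standardize(data)
pc1 = first_principal_component(covariance_matrix(scaled))
scores = project(scaled, pc1)
print("first principal component:", pc1)
print("1-D representation:", scores)
```

Because the two features are almost perfectly correlated, a single component captures nearly all of the variance, so the two columns can be replaced by one score per row.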
4. Selecting the principal components
Ans:
• Scree plot
• Kaiser criterion
• Proportion of explained variance
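Two of the criteria listed above can be computed directly from the eigenvalues of the covariance matrix. The sketch below assumes the eigenvalues are already known; the values, the function names and the 90% threshold are hypothetical choices for illustration. A scree plot is simply these eigenvalues plotted in decreasing order.

```python
def explained_variance_ratio(eigenvalues):
    """Fraction of the total variance carried by each component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def components_for_variance(eigenvalues, threshold=0.95):
    """Smallest k whose leading components explain at least `threshold`."""
    ratios = explained_variance_ratio(sorted(eigenvalues, reverse=True))
    cumulative = 0.0
    for k, ratio in enumerate(ratios, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return k
    return len(ratios)


def kaiser_criterion(eigenvalues):
    """Kaiser criterion: keep components whose eigenvalue is greater than 1.

    This rule of thumb is meant for standardized data, where each original
    feature contributes a variance of exactly 1.
    """
    return sum(1 for ev in eigenvalues if ev > 1.0)


# Hypothetical eigenvalues for a standardized dataset with four features.
eigenvalues = [2.4, 1.1, 0.3, 0.2]
k_variance = components_for_variance(eigenvalues, threshold=0.90)
k_kaiser = kaiser_criterion(eigenvalues)
print("components needed for 90% of the variance:", k_variance)
print("components kept by the Kaiser criterion:", k_kaiser)
```

The criteria need not agree; here the explained-variance rule keeps one more component than the Kaiser criterion.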

Week 3
1. Linear classification
Ans: Linear classification is a type of classification algorithm used in
machine learning that separates data points into two or more classes using
a linear boundary or hyperplane.
Commonly used Linear classification algorithms
Logistic regression: A probabilistic linear classification algorithm that
models the probability of a data point belonging to each class using a
logistic function.
Linear discriminant analysis (LDA): A statistical linear classification
algorithm that models the distribution of the input features for each class
using a multivariate normal distribution and finds the boundary that
maximizes the separation between the classes.
Support vector machines (SVM): A linear classification algorithm that
finds the hyperplane that maximizes the margin between the two classes.
Perceptron Algorithm: It is a type of single-layer neural network that
takes in input data and produces an output based on a set of weights and
biases.
2. Logistic regression
Ans: Logistic regression is one of the most popular Machine
Learning algorithms, which comes under the Supervised
Learning technique.
It is a statistical model used to analyse the relationship between
a categorical dependent variable and one or more independent
variables. It is used to predict the probability of the occurrence
of an event by fitting data to a logistic function.
In logistic regression, the dependent variable is binary, meaning
it can take only two values, usually represented as 0 or 1. The
independent variables can be either categorical or continuous.
The goal is to find a relationship between the independent
variables and the probability of the dependent variable being 1.
Logistic Regression is similar to Linear Regression in form but
differs in how it is used: Linear Regression is used for solving
regression problems, whereas Logistic Regression is used for
solving classification problems.
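The idea of fitting data to a logistic function can be sketched for a single independent variable. This is a minimal illustration, assuming gradient descent on the log loss as the fitting method (the notes do not name one); the function names, learning rate and data are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit p(y = 1 | x) = sigmoid(w * x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data: hours studied and whether the student passed (1) or failed (0).
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic_regression(hours, passed)

for h in (1.0, 4.0):
    probability = sigmoid(w * h + b)
    label = 1 if probability >= 0.5 else 0
    print(h, "hours -> probability of passing", round(probability, 3), "-> class", label)
```

The model outputs a probability; thresholding it at 0.5 turns the probability into a class label.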
3. Support vector machines (SVM)
Ans: Support Vector Machine or SVM is one of the most popular
Supervised Learning algorithms, which is used for Classification as well as
Regression problems. However, primarily, it is used for Classification
problems in Machine Learning.
The goal of the SVM algorithm is to create the best line or decision
boundary that can segregate n-dimensional space into classes so that we
can easily put the new data point in the correct category in the future. This
best decision boundary is called a hyperplane.
Types of SVM
• Linear SVM
• Non-Linear SVM
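For the linear case, the search for a separating hyperplane with a large margin can be sketched as minimizing the regularized hinge loss. This is only one of several ways to train a linear SVM; subgradient descent is used here as an assumption because it keeps the code short. Class labels are encoded as +1 and -1, and all names, parameters and data are hypothetical.

```python
def train_linear_svm(points, labels, lam=0.01, learning_rate=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + average hinge loss by subgradient descent.

    `labels` must contain +1 or -1.
    """
    d = len(points[0])
    n = len(points)
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularization term
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:  # point is inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Side of the hyperplane w.x + b = 0 on which x lies."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable two-dimensional data.
points = [(2, 2), (3, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
print("weights:", w, "bias:", b)
print("prediction for (4, 1):", predict(w, b, (4, 1)))
```

A non-linear SVM replaces the dot product with a kernel function; that case is not covered by this sketch.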

Week 4
1. Binary and multi-class classification.
Ans: Binary Classification: It is a classification task in which the
given data is classified into two classes. It is basically a prediction
about which of two groups an item belongs to.
For example, an email can be classified as either spam or primary. There
are two discrete classes, so this is a problem of binary classification.
Multi-Class Classification: Multi-class classification is the task of
classifying elements into more than two classes. Unlike binary
classification, it is not restricted to two classes.
Examples of multi-class classification are
• classification of news in different categories,
• classifying books according to the subject,
• classifying students according to their streams etc.
2. Comparison of algorithms: SVM, linear perceptron,
logistic regression
Ans:
SVM: SVM is a powerful and flexible algorithm for binary classification and
also works well for multi-class classification tasks. It tries to find the
hyperplane that maximally separates the different classes in the data.
Perceptron: Perceptron is a simple and fast algorithm for binary
classification tasks. It works by finding a linear decision boundary that
separates the two classes.
Logistic Regression: Logistic Regression is a probabilistic algorithm for
binary classification tasks. It models the probability of an instance
belonging to a particular class using a logistic function.

Algorithm            Classification type      Decision boundary             Probabilistic output
SVM                  Binary and Multi-class   Both (linear and non-linear)  No
Perceptron           Binary                   Linear                        No
Logistic Regression  Binary and Multi-class   Linear                        Yes
3. Advantages, disadvantages, and scenarios for using each
algorithm
Ans:
SVM
Advantage: Powerful and flexible algorithm for handling high-
dimensional feature spaces and non-linearly separable data.
Disadvantage: Can be computationally expensive for large datasets and
sensitive to the choice of kernel function.
Perceptron
Advantage: Simple and fast algorithm for linearly separable data.
Disadvantage: Limited in its ability to handle non-linearly separable data
and prone to overfitting with large number of features.

Logistic Regression
Advantage: Probabilistic algorithm that models the probability of an
instance belonging to a particular class, works well for linearly
separable data, and is less prone to overfitting.
Disadvantage: May not work well for non-linearly separable data and
assumes linear relationship between input features and output variable.

Week 5
1. Neural Network and Activation functions.
Ans:
Neural Network
Neural networks are an information processing paradigm inspired by the
human nervous system. Just as the human nervous system is built from
biological neurons, a neural network is built from artificial neurons.
Artificial neurons are mathematical functions modelled on biological
neurons.
A neural network is a type of artificial intelligence (AI) algorithm modeled
after the structure and function of the human brain. It consists of
interconnected nodes or artificial neurons that process and transmit
information through a series of layers. The input data is passed through
the network, and the neurons compute and modify the data based on the
strength of their connections, producing an output.
Neural networks are used in a variety of applications, including image and
speech recognition, natural language processing, etc.
Elements of a Neural Network
Input Layer: This layer accepts input features. It provides information
from the outside world to the network, no computation is performed at
this layer, nodes here just pass on the information(features) to the hidden
layer.
Hidden Layer: Nodes of this layer are not exposed to the outer world;
they are part of the abstraction provided by any neural network. The
hidden layer performs all sorts of computation on the features entered
through the input layer and transfers the result to the output layer.
Output Layer: This layer brings up the information learned by the network
to the outer world.

Activation Function
Activation functions are mathematical functions that are applied to the
output of each node in a neural network to introduce non-linearity into
the model. These functions allow neural networks to learn complex
patterns and relationships in data.
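Three widely used activation functions are sketched below. The notes do not name specific functions, so the choice of sigmoid, tanh and ReLU is an assumption made for illustration.

```python
import math


def sigmoid(z):
    """Squashes the input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squashes the input into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Passes positive inputs through unchanged and maps negative inputs to 0."""
    return max(0.0, z)


for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), tanh(z), relu(z))
```

All three are non-linear, which is what lets a stack of layers represent more than a single linear transformation.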
2. Perceptron algorithm and how it works
Ans:
The Perceptron algorithm is a linear classification algorithm used for
binary classification tasks. It is a type of supervised learning algorithm that
learns a decision boundary to separate the input data into two classes. The
decision boundary is represented as a hyperplane that separates the input
data into two regions.
Working
• Initialize the weights and bias: The Perceptron algorithm starts by
initializing the weights and bias to small random values.

• Iterate over the training data: For each input data point, the
Perceptron algorithm computes the weighted sum of the inputs
multiplied by their respective weights, and adds the bias term. This is
the net input value.
• Apply the activation function: The net input value is then passed
through an activation function (usually a step function) to produce the
output value. If the net input is greater than or equal to 0, the
Perceptron predicts the positive class; otherwise, it predicts the
negative class.
• Update the weights and bias: If the Perceptron makes an incorrect
prediction, the weights and bias are updated to move the decision
boundary towards the correct classification. Specifically, each weight
is changed by the product of the learning rate (a small positive value
that controls the magnitude of the update), the error (the true class
minus the predicted class), and the corresponding input value. The bias
is updated in the same way, but without the input value.
• Repeat steps 2-4: The Perceptron algorithm iterates over the training
data multiple times (epochs) until the decision boundary converges
and no further weight updates are required.
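The steps above can be sketched in a few lines of code. Two simplifications are assumptions of this example rather than part of the algorithm as described: the weights start at zero instead of small random values, and the learning rate is set to 1.0, so that the run is reproducible. The training data is the logical AND function, chosen because it is linearly separable.

```python
def step(z):
    """Step activation: positive class (1) if the net input is >= 0, else 0."""
    return 1 if z >= 0 else 0


def train_perceptron(points, labels, learning_rate=1.0, max_epochs=100):
    """Perceptron learning rule for binary labels 0 and 1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(points, labels):
            net_input = sum(wj * xj for wj, xj in zip(w, x)) + b
            error = target - step(net_input)  # true class minus predicted class
            if error != 0:
                w = [wj + learning_rate * error * xj for wj, xj in zip(w, x)]
                b += learning_rate * error
                errors += 1
        if errors == 0:  # a full pass with no updates: converged
            break
    return w, b


def predict(w, b, x):
    return step(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Logical AND: the output is 1 only when both inputs are 1.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
w, b = train_perceptron(points, labels)
print("weights:", w, "bias:", b)
print("predictions:", [predict(w, b, x) for x in points])
```

On data that is not linearly separable, such as XOR, the loop would never reach an error-free pass and would stop only when `max_epochs` is exhausted.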
Weight: The weight parameter represents the strength of the connection
between units. It is one of the most important parameters of the
Perceptron.

3. Backpropagation algorithm
Ans:
Backpropagation is one of the important concepts of a neural network.
The Backpropagation algorithm is used for training artificial neural
networks. It is a supervised learning algorithm that can be used for both
classification and regression tasks.
Working
• First, we initialize the weights and biases of the neural network to small
random values.
• Then, we feed an input data point into the network and compute the
output using the current weights and biases.
• Next, we calculate the error between the predicted output and the true
output.
• The error is then propagated backwards through the network, layer by
layer, using the chain rule to compute the gradient of the error with
respect to each weight and bias.
• Gradient descent then updates each weight and bias by a small step in
the direction opposite to its gradient, which reduces the error.
• This process is repeated for multiple input data points, in batches or
one at a time, until the error is minimized and the network can
accurately predict the output for new input data.
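The procedure above can be sketched for the smallest possible network: one input, one sigmoid hidden unit and one linear output unit, trained on the squared error. The architecture, the fixed initial values (used instead of random ones so the run is reproducible), the learning rate and the data are all assumptions made for this illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Forward pass: returns the hidden activation and the network output."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)  # hidden unit
    y_hat = w2 * h + b2       # linear output unit
    return h, y_hat


def backward(params, x, y):
    """Gradients of the loss 0.5 * (y_hat - y)^2 with respect to each parameter."""
    w1, b1, w2, b2 = params
    h, y_hat = forward(params, x)
    d_out = y_hat - y                    # derivative of the loss w.r.t. y_hat
    grad_w2 = d_out * h
    grad_b2 = d_out
    d_hidden = d_out * w2 * h * (1 - h)  # error propagated back through the sigmoid
    grad_w1 = d_hidden * x
    grad_b1 = d_hidden
    return [grad_w1, grad_b1, grad_w2, grad_b2]


def loss(params, data):
    return sum(0.5 * (forward(params, x)[1] - y) ** 2 for x, y in data) / len(data)


def train(params, data, learning_rate=0.1, epochs=2000):
    """Gradient descent, updating after every data point."""
    for _ in range(epochs):
        for x, y in data:
            grads = backward(params, x, y)
            params = [p - learning_rate * g for p, g in zip(params, grads)]
    return params


data = [(0.0, 0.0), (1.0, 1.0)]    # hypothetical input/output pairs
initial = [0.5, -0.5, 0.5, 0.1]    # w1, b1, w2, b2
trained = train(initial, data)
print("loss before training:", loss(initial, data))
print("loss after training:", loss(trained, data))
```

In a larger network the same chain-rule step is applied once per layer, moving from the output layer back to the input layer.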

4. Hyperparameters and their selection
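Ans: Hyperparameters are parameters that are not learned by the model
during training but are set by the user before training begins. They control
the behaviour of the learning algorithm and affect the performance of the
model. Examples include the learning rate, the number of hidden layers,
and the number of training epochs. They are usually selected by comparing
model performance on a validation set for different candidate values.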

Week 6
1. Decision trees
Ans: Decision Tree is a Supervised learning technique that can be used
for both classification and Regression problems, but mostly it is preferred
for solving Classification problems. It is a tree-structured classifier, where
internal nodes represent the features of a dataset, branches represent the
decision rules and each leaf node represents the outcome.
In a Decision tree, there are two types of nodes, which are the Decision
Node and the Leaf Node. Decision nodes are used to make a decision and
have multiple branches, whereas Leaf nodes are the output of those
decisions and do not contain any further branches.
It is a graphical representation for getting all the possible solutions to a
problem/decision based on given conditions.
Root Node: Root node is from where the decision tree starts. It represents
the entire dataset, which further gets divided into two or more
homogeneous sets.
Leaf Node: Leaf nodes are the final output node, and the tree cannot be
segregated further after getting a leaf node.

Splitting: Splitting is the process of dividing the decision node/root node
into sub-nodes according to the given conditions.
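Splitting needs a rule for deciding which condition divides a node best. The sketch below uses Gini impurity, which is one common choice and an assumption here, since the notes do not name a splitting measure. It searches for the threshold on a single numeric feature that gives the purest sub-nodes. The feature, labels and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity: 0 for a pure node, larger for a more mixed node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Try thresholds midway between sorted distinct values.

    Returns the threshold with the lowest weighted impurity and that impurity.
    """
    distinct = sorted(set(values))
    n = len(labels)
    best_threshold, best_impurity = None, float("inf")
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        impurity = len(left) / n * gini(left) + len(right) / n * gini(right)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Hypothetical data: customer age and whether the customer bought a product.
ages = [22, 25, 30, 41, 47, 52]
buys = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, buys)
print("split at age <=", threshold, "with weighted impurity", impurity)
```

A full decision tree applies the same search recursively to each sub-node until a stopping criterion is met.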
2. Pruning Techniques
Ans: Pruning is a process of deleting the unnecessary nodes from a tree
in order to get the optimal decision tree.
A tree that is too large increases the risk of overfitting, and a tree
that is too small may not capture all the important features of the
dataset. Therefore, a technique that decreases the size of the learned
tree without reducing accuracy is known as Pruning.
Types of pruning techniques
• Cost Complexity Pruning
• Reduced Error Pruning.

3. Regression trees
Ans: A regression tree is basically a decision tree that is used for the task
of regression which can be used to predict continuous valued outputs
instead of discrete outputs.
Advantages of regression trees
• Visualization of data becomes easier as users can identify and process
each and every step.
• A specific decision node could be set to have a priority against other
decision nodes.
• As the regression tree progresses, undesired data will be filtered at
each step. As a result, only important data is left to process, which
increases the efficiency and accuracy of our design.
• It is easy to prepare regression trees – they can be used to present data
during meetings, presentations, etc.
4. Importance of stopping criteria
Ans: Stopping criteria are used in decision trees to determine when to
stop the recursive process of splitting the data into smaller subsets. The
stopping criteria help prevent overfitting, where the model becomes too
complex and starts to fit the noise in the data instead of the underlying
pattern.
Commonly used stopping criteria in decision trees
• Maximum depth: This stopping criterion sets a limit on the maximum
depth of the tree. Once a tree reaches this depth, no more splits are
allowed, and the node becomes a leaf node.
• Minimum number of samples: This stopping criterion specifies a
minimum number of samples required to split a node. If a node has
fewer samples than this threshold, it is not split, and it becomes a leaf
node.
• Maximum number of leaf nodes: This stopping criterion sets a limit
on the maximum number of leaf nodes that a tree can have. Once the
tree has reached this limit, no more splits are allowed.

Week 7
1. ROC curve.
Ans: ROC (Receiver Operating Characteristic) curve is a graphical
plot that illustrates the performance of a binary classifier at
different classification thresholds. The curve is created by
plotting the true positive rate (TPR) against the false positive rate
(FPR) at various threshold settings.
The true positive rate (TPR) is also called sensitivity and is defined
as the fraction of positive instances that are correctly identified
by the classifier. The false positive rate (FPR) is defined as the
fraction of negative instances that are incorrectly classified as
positive by the classifier.
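The points of a ROC curve can be computed directly from these two definitions. The sketch below assumes the classifier outputs a score for each instance and that an instance is predicted positive when its score is at or above the threshold; the scores, labels and function names are hypothetical.

```python
def tpr_fpr(scores, labels, threshold):
    """True positive rate and false positive rate at one threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp / (tp + fn), fp / (fp + tn)


def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by lowering the threshold step by step."""
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tpr, fpr = tpr_fpr(scores, labels, threshold)
        points.append((fpr, tpr))
    return points


# Hypothetical classifier scores and true labels (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
for fpr, tpr in curve:
    print("FPR", round(fpr, 2), "TPR", round(tpr, 2))
```

Plotting these pairs with FPR on the horizontal axis and TPR on the vertical axis gives the ROC curve.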
2. Cross-Validation and K-fold method
Ans: Cross-validation is a technique for validating the model efficiency
by training it on the subset of input data and testing on previously unseen
subset of the input data. We can also say that it is a technique to check
how a statistical model generalizes to an independent dataset.
The basic idea of cross-validation is to split the data into two or more
subsets, where one subset is used for training the model, and the other
subsets are used for testing the model. The training subset is used to fit
the model, and the testing subset is used to evaluate the performance of
the model on new, unseen data. This process is repeated multiple times
with different subsets, and the performance metrics are averaged to
obtain an estimate of the generalization performance.
K-Fold Cross-Validation
The k-fold cross-validation approach divides the input dataset into k
groups of samples of equal size. These groups are called folds. The model
is trained on k-1 folds and tested on the remaining fold. This process is
repeated k times, with each fold serving as the testing set once. The
performance metrics are averaged over the k iterations to obtain an
estimate of the generalization performance.
The steps for k-fold cross-validation
• Split the input dataset into K groups
• For each group:
o Take one group as the reserve or test data set.
o Use remaining groups as the training dataset
o Fit the model on the training set and evaluate the performance of
the model using the test set.
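These steps can be sketched as follows. The "model" here is deliberately trivial (it always predicts the mean of the training targets) so that the splitting logic stays in focus; the function names, the data and the choice of mean squared error as the metric are assumptions for illustration. The data is split in its original order, whereas in practice it is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so that sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(xs, ys, k, fit, score):
    """Average the test score over the k folds."""
    fold_scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        fold_scores.append(score(model, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(fold_scores) / len(fold_scores)


def fit_mean(train_x, train_y):
    """Trivial model: remember the mean of the training targets."""
    return sum(train_y) / len(train_y)


def mean_squared_error(model, test_x, test_y):
    return sum((model - y) ** 2 for y in test_y) / len(test_y)


xs = list(range(10))
ys = [2.0 * x for x in xs]
folds = list(k_fold_indices(len(xs), 5))
cv_error = cross_validate(xs, ys, 5, fit_mean, mean_squared_error)
print("folds:", folds)
print("cross-validated mean squared error:", cv_error)
```

Every sample appears in the test set exactly once, so every data point contributes to the estimate.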
3. Ensemble Methods.
Ans: Ensemble methods in machine learning are techniques that combine
multiple individual models to create a more accurate and robust model.
Ensemble methods are based on the idea that combining multiple models
can often lead to better performance than using a single model.
4. Bagging and Boosting
Ans: Bagging, also known as Bootstrap aggregating, is an ensemble
learning technique that helps to improve the performance and accuracy
of machine learning algorithms. It trains several models on different
bootstrap samples of the training data and combines their predictions,
which reduces the variance of the prediction model. Bagging helps to
reduce overfitting and is used for both regression and classification
models, especially with decision tree algorithms.
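Bagging can be sketched in two parts: drawing bootstrap samples, and aggregating the predictions of the models trained on them. In the example below, the base learner is a one-nearest-neighbour rule on a single feature, chosen only to keep the code short. The base learner, the number of models, the random seed and the data are all assumptions.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def nearest_neighbour_predict(train, x):
    """Base learner: label of the training point closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def bagged_predict(data, x, n_models=25, seed=0):
    """Train one base learner per bootstrap sample and take a majority vote."""
    rng = random.Random(seed)
    votes = [nearest_neighbour_predict(bootstrap_sample(data, rng), x)
             for _ in range(n_models)]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical one-dimensional data: (feature value, class label).
data = [(1.0, "a"), (1.5, "a"), (2.0, "a"), (8.0, "b"), (8.5, "b"), (9.0, "b")]
print("prediction for 1.2:", bagged_predict(data, 1.2))
print("prediction for 8.7:", bagged_predict(data, 8.7))
```

For regression, the majority vote would be replaced by the average of the individual predictions.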

Boosting is a machine learning technique for building ensemble models,
in which multiple weak learners are combined to form a strong learner. A
weak learner is a model that performs only slightly better than random
guessing. Boosting improves accuracy by training new models one after
another on the training data, each time adjusting the weights of the
training examples so that the next model concentrates on the examples
that earlier models got wrong.
Types of Boosting
• Adaptive boosting
• Gradient Boosting
• Extreme gradient boosting

Week 8
1. Random Forest
Ans: Random Forest is a popular supervised machine learning algorithm.
It can be used for both Classification and Regression problems in ML. It
is an ensemble learning method that combines multiple decision trees to
form a stronger model.
Random Forest has several advantages over other machine learning
algorithms. It can handle a large number of input features and can work
well even when some of the features are irrelevant or redundant. It is also
relatively resistant to overfitting, since the predictions of the individual
trees are combined to form a more robust overall prediction.

2. Naive Bayes and Bayesian Networks
Ans:
Bayesian Networks
Bayesian networks, also known as Bayes networks or belief networks, are
a type of probabilistic graphical model used in machine learning, statistics,
and artificial intelligence. They are used to represent and reason about
uncertainty and probability in complex systems.
Bayesian networks can be used for a variety of tasks in machine learning,
including classification, regression, anomaly detection, and decision
making under uncertainty. They have several advantages over other
machine learning algorithms, such as the ability to handle missing data
and the ability to incorporate domain knowledge into the model.
Naïve Bayes
• Naïve Bayes algorithm is a supervised learning algorithm, which is
based on Bayes theorem and used for solving classification problems.
• It is mainly used in text classification that includes a high-dimensional
training dataset.
• Naïve Bayes Classifier is one of the simplest and most effective
classification algorithms. It helps in building fast machine learning
models that can make quick predictions.
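A small text classifier shows how Bayes' theorem is applied. The sketch below treats the words of a document as conditionally independent given the class, which is the "naïve" assumption, and uses add-one (Laplace) smoothing so that unseen words do not produce a zero probability. The training documents, labels and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """documents: list of (list_of_words, label) pairs."""
    class_counts = Counter(label for _, label in documents)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in documents:
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict(model, words):
    """Return the class with the highest log posterior probability."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = math.log(count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in words:
            # Add-one smoothing: every vocabulary word gets a pseudo-count of 1.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


training = [
    ("win money now".split(), "spam"),
    ("free money offer".split(), "spam"),
    ("meeting at noon".split(), "ham"),
    ("project meeting notes".split(), "ham"),
]
model = train_naive_bayes(training)
print(predict(model, "free money".split()))
print(predict(model, "project meeting".split()))
```

Training only requires counting words, which is why the model is fast to build even when the vocabulary is large.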

Week 12
1. Introduction to reinforcement learning
Ans: Reinforcement Learning is a feedback-based machine learning
technique in which an AI agent (a software component) automatically
explores its surroundings by trial and error, taking actions, learning from
experience, and improving its performance. The agent gets rewarded for
each good action and punished for each bad action; hence the goal
of a reinforcement learning agent is to maximize the rewards.
In Reinforcement Learning, the agent learns automatically from
feedback without any labelled data, unlike supervised learning.
Agent: An entity that can perceive/explore the environment and act upon
it.
Environment: The situation in which an agent is present or by which it is surrounded.
Action: Actions are the moves taken by an agent within the environment.
State: State is a situation returned by the environment after each action
taken by the agent.

Reward: Feedback returned to the agent by the environment to
evaluate the agent's action.
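
The terms above fit together in a simple interaction loop. The sketch below uses a hypothetical corridor environment and an agent that only acts at random; the class names, reward values, and step limit are assumptions made for illustration.

```python
import random

class CorridorEnvironment:
    """Hypothetical environment: positions 0..4, with a goal at position 4."""
    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        """Apply an action (-1 = left, +1 = right); return (state, reward, done)."""
        self.state = min(4, max(0, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else -0.1  # reward at the goal, small penalty otherwise
        return self.state, reward, done

class RandomAgent:
    """An agent that has not learned anything yet: it picks actions at random."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice([-1, 1])

env = CorridorEnvironment()
agent = RandomAgent()
state = env.reset()
total_reward, steps, done = 0.0, 0, False
while not done and steps < 1000:
    action = agent.act(state)               # Action chosen by the Agent
    state, reward, done = env.step(action)  # new State and Reward from the Environment
    total_reward += reward
    steps += 1
print(steps, done, round(total_reward, 1))
```

A learning agent would replace `RandomAgent.act` with a rule that uses the rewards seen so far to prefer better actions.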
Advantages and Disadvantages
Advantages
• Flexibility: Reinforcement learning can be used in a variety of problem
domains, including robotics, games, and finance. It can also handle
problems with continuous state and action spaces.
• Adaptability: RL agents can learn from experience and adjust their
behaviour to changing environments.
• Autonomy: Once an RL agent has been trained, it can operate
autonomously, without the need for constant supervision.
• Optimal decision-making: Given enough experience, reinforcement
learning can find an optimal or near-optimal policy for a given task,
which can lead to better outcomes than human-designed policies.
Disadvantages
• RL typically requires a large amount of data and computation to learn
an effective policy.
• RL algorithms are not preferred for simple problems, where simpler
methods are sufficient.
• Too much reinforcement can lead to an overload of states, which can
weaken the results.
2. Features and applications of reinforcement learning
Ans:
Features
• In RL, the agent is not told about the environment or which actions
to take.
• It is based on a trial-and-error process.
• The agent takes the next action and changes states according to the
feedback of the previous action.
• The agent may get a delayed reward.

• The environment is very complex, and the agent needs to explore it to
obtain the maximum positive reward.
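
The features above (trial and error, state changes driven by feedback, and delayed reward) can be illustrated with tabular Q-learning, one common RL algorithm. This is a minimal sketch on a hypothetical five-position corridor where only reaching the last position is rewarded; the environment, the constants `ALPHA` and `GAMMA`, and the purely random exploration are assumptions chosen for simplicity.

```python
import random

N_STATES = 5           # positions 0..4; position 4 is the goal
ACTIONS = [-1, 1]      # move left or move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor

def step(state, action):
    """Hypothetical environment: reward 1 only when the goal is reached."""
    next_state = min(N_STATES - 1, max(0, state + action))
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(2)  # pure exploration: try actions at random
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q_table = train()
greedy_policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(greedy_policy)
```

Although the reward arrives only at the final step, the discount factor propagates its value back to earlier positions, so the learned values favour moving toward the goal from every position.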
Applications
Robotics: Reinforcement learning has been applied to train robots to
perform tasks such as object manipulation, navigation, and grasping.
For example, RL has been used to teach robots to play table tennis,
fold towels, and open doors.
Game Playing: Reinforcement learning has been successfully applied
to train agents that can play board games such as Chess and Go, as
well as video games.
Natural Language Processing: Reinforcement learning is being used
to improve natural language processing tasks such as machine
translation, question-answering, and chatbots. RL helps the agent
learn how to generate better responses based on the feedback it
receives.
Traffic light control: RL can be used to optimize the timing of traffic
signals in order to minimize waiting times and reduce congestion. By
learning from historical data and from experience, a traffic light
control system can adjust signal timings dynamically based on traffic
volumes and congestion levels.
Driverless cars: Reinforcement learning can be used to help driverless
cars learn how to navigate through traffic and make decisions based on
the environment they encounter. RL algorithms let the car learn from
past experience, adapting to changing road conditions and adjusting
its decisions according to the feedback it receives. Reinforcement
learning is one of the techniques used in the development of
autonomous vehicles.

3. Comparison of reinforcement learning and deep learning
Ans:
Reinforcement Learning (RL) vs. Deep Learning (DL)
Objective
• RL: Learn to take actions that maximize a reward signal.
• DL: Learn to generalize patterns in data.
Feedback
• RL: Received as rewards or penalties for actions taken in an
environment.
• DL: Received as the error between predicted and actual outputs.
Training
• RL: Uses a trial-and-error approach, with a feedback mechanism to
guide learning.
• DL: Trained on input data, using backpropagation to minimize error.
Use Cases
• RL: Robotics, game playing, autonomous systems.
• DL: Image recognition, speech recognition, natural language
processing, data analysis.
Data Requirements
• RL: Needs no pre-collected labelled dataset, but needs feedback
gathered through many interactions with the environment.
• DL: Requires large amounts of input data.
Computational Resources
• RL: Depends on the task. Small problems can be cheap, but RL
combined with deep networks (deep RL) can be as demanding as DL.
• DL: Typically requires substantial computational resources to train
large networks.

4. Comparison of supervised and reinforcement learning
Ans:
Reinforcement Learning (RL) vs. Supervised Learning (SL)
Data source
• RL: Works by interacting with the environment.
• SL: Works on an existing dataset.
Analogy
• RL: Works the way the human brain does when making decisions.
• SL: Works the way a human learns under the supervision of a guide.
Labels
• RL: No labelled dataset is present.
• SL: A labelled dataset is present.
Prior training
• RL: No previous training is provided to the learning agent.
• SL: Training is provided to the algorithm so that it can predict the
output.
Decisions
• RL: Helps to take decisions sequentially.
• SL: A decision is made each time an input is given.
