NPTL – Machine Learning
Created By: Madhur Jatiya
Email: Madhurjatiya13@gmail.com
Machine Learning
Syllabus
Week 1: Introduction: Statistical Decision Theory - Regression,
Classification, Bias Variance.
Week 2: Linear Regression, Multivariate Regression, Subset Selection,
Shrinkage Methods, Principal Component Regression, Partial Least
squares.
Week 3: Linear Classification, Logistic Regression, Linear Discriminant
Analysis.
Week 4: Perceptron, Support Vector Machines.
Week 5: Neural Networks Introduction, Early Models, Perceptron
Learning, Backpropagation, Initialization, Training & Validation, Parameter
Estimation - MLE, MAP, Bayesian Estimation.
Week 6: Decision Trees, Regression Trees, Stopping Criterion & Pruning,
Loss functions, Categorical Attributes, Multiway Splits, Missing Values,
Decision Trees - Instability, Evaluation Measures.
Week 7: Bootstrapping & Cross Validation, Class Evaluation Measures,
ROC curve, MDL, Ensemble Methods - Bagging, Committee Machines and
Stacking, Boosting
Week 8: Gradient Boosting, Random Forests, Multi-class Classification,
Naive Bayes, Bayesian Networks
Week 9: Undirected Graphical Models, HMM, Variable Elimination, Belief
Propagation.
Week 10: Partitional Clustering, Hierarchical Clustering, Birch Algorithm,
CURE Algorithm, Density-based Clustering.
Week 11: Gaussian Mixture Models, Expectation Maximization.
Week 12: Learning Theory, Introduction to Reinforcement Learning,
Optional videos (RL framework, TD learning, Solution Methods,
Applications).
Important Questions
Week 1
• What is machine learning?
• Supervised Learning
• Unsupervised learning
• Steps to implement supervised learning
• Classification, overfitting, bias-variance trade-off
• Common techniques to avoid overfitting
Week 2
• Linear Regression
• Principal Component Analysis (PCA)
• How to reduce dimensionality using PCA
• Selecting the principal components
Week 3
• Linear classification
• Logistic regression
• Support Vector Machines
Week 4
• Binary and multi-class classification contexts
• Comparison of algorithms: SVM, perceptron, logistic regression
• Advantages, disadvantages, and scenarios for using each algorithm
Week 5
• Neural Network and activation functions
• Perceptron algorithm and how it works
• Backpropagation algorithm
• Hyperparameters and their selection
Week 6
• Decision Tree
• Pruning Techniques
• Regression tree
• Importance of stopping criteria
Week 7
• ROC Curve
• Cross Validation and K-fold method
• Ensemble Methods
• Bagging and boosting techniques
Week 8
• Random Forest
• Naive Bayes and Bayesian Networks
Week 12
• Reinforcement learning
• Comparison of Supervised and Reinforcement Learning
• Features and Applications of Reinforcement Learning
• Comparison of Reinforcement Learning and Deep Learning
Week 1
1. What is Machine Learning?
Ans: Machine learning is the process of using algorithms and statistical
models to allow a computer system to improve its performance on a
specific task by learning from data, without being explicitly programmed.
It involves training a model on a dataset and then using that model to
make predictions on new data.
Classification of Machine Learning
• Supervised learning
• Unsupervised learning
• Reinforcement learning
Application of Machine Learning
• Image Recognition: Image recognition is one of the most common
applications of machine learning. It is used to identify objects, persons,
places, etc. in digital images. A popular use case of image recognition
is face detection.
• Speech Recognition: The "Search by voice" option in Google is an
example of speech recognition, a popular application of machine
learning.
• Traffic prediction: When we want to visit a new place, we take the help
of Google Maps, which shows us the shortest route and predicts the
traffic conditions.
• Self-driving cars: One of the most exciting applications of machine
learning is self-driving cars.
2. Supervised Learning
Ans: Supervised learning is a type of machine learning in which the model
is trained using labelled data, where the outcome or response variable is
known. The goal of SL is to create a model that can accurately predict the
output for new input data.
Types of supervised Machine learning Algorithms
Regression: Regression algorithms are used when there is a relationship
between the input variables and a continuous output variable. They are
used for the prediction of continuous quantities, as in weather
forecasting, market trends, etc.
• Linear Regression
• Non-Linear Regression
• Regression Trees
• Bayesian Linear Regression
• Polynomial Regression
Classification: Classification algorithms are used when the output
variable is categorical, which means it takes one of two or more classes,
such as Yes-No, Male-Female, True-False, etc.
• Random Forest
• Decision Trees
• Logistic Regression
• Support vector Machines
Advantages of Supervised learning
• With the help of supervised learning, the model can predict the output
on the basis of prior experiences.
• In supervised learning, we can have an exact idea about the classes of
objects.
• Supervised learning model helps us to solve various real-world
problems such as fraud detection, spam filtering, etc.
Disadvantages of Supervised learning
• Supervised learning models can struggle with very complex tasks.
• Supervised learning may not predict the correct output if the test data
differs significantly from the training dataset.
• Training requires a lot of computation time.
3. Unsupervised Learning
Ans: Unsupervised learning is a type of machine learning in which the
model is trained using unlabelled data, where the outcome or response
variable is not known. The goal of unsupervised learning is to find patterns
or structure in the data.
Types of Unsupervised Learning Algorithm
Clustering: Clustering is a method of grouping objects into clusters
such that objects with the most similarities remain in one group and have
few or no similarities with the objects of another group.
Association: Association rule learning is an unsupervised learning method
used for finding relationships between variables in a large
database.
4. Steps to implement Supervised Learning
Ans:
• Collect labelled data: Gather data with labelled examples of the input-
output relationship.
• Pre-process the data: Clean, format, and prepare the data for training
the model.
• Feature selection: Choose the most important features from the data
to reduce dimensionality.
• Model selection: Choose an appropriate model that can learn the
input-output relationship from the data.
• Hyperparameter tuning: Adjust the model's hyperparameters to
optimize performance.
• Train the model: Feed the data into the model and adjust its
parameters to minimize the difference between predicted and actual
output.
• Evaluate the model's performance: Use various metrics to measure
the model's ability to generalize to new data.
5. Overfitting, Underfitting, bias-variance trade-off.
Ans:
Overfitting
• Overfitting occurs when our machine learning model tries to fit all
the data points in the given dataset, including more detail than the
underlying trend requires.
• Because of this, the model starts capturing noise and inaccurate values
present in the dataset, which reduces its accuracy on new data.
• The overfitted model has low bias and high variance.
• The chance of overfitting increases the longer we train our model: the
more we train, the more likely the model is to fit the noise in the
training data.
Underfitting
• Underfitting occurs when our machine learning model is not able to
capture the underlying trend of the data.
• To avoid overfitting, the feeding of training data can be stopped at
an early stage, but if it is stopped too early the model may not learn
enough from the training data.
• As a result, it may fail to find the best fit of the dominant trend in the
data.
• In the case of underfitting, the model is not able to learn enough from
the training data, and hence it reduces the accuracy and produces
unreliable predictions.
• An underfitted model has high bias and low variance.
Bias-Variance Trade Off
Reducible errors: These errors can be reduced to improve the model
accuracy. Such errors can further be classified into bias and Variance.
Irreducible errors: These errors will always be present in the model,
regardless of which algorithm is used.
Bias: While making predictions, a difference occurs between the average
prediction of the model and the actual/expected values. This
difference is known as bias error or error due to bias.
Variance: The variability of the model's prediction for a given data point
when the model is trained on different training sets is called the
variance of the model.
Bias-Variance Trade-Off: While building the machine learning model, it
is really important to take care of bias and variance in order to avoid
overfitting and underfitting in the model. If the model is very simple with
fewer parameters, it may have low variance and high bias. Whereas, if the
model has a large number of parameters, it will have high variance and
low bias. So, it is required to make a balance between bias and variance
errors, and this balance between the bias error and variance error is known
as the Bias-Variance trade-off.
6. Common techniques to avoid overfitting
Ans:
• Early stopping: Early stopping halts the training phase before the
machine learning model starts to learn the noise in the data.
• Regularization: Regularization is a collection of training/optimization
techniques that seek to reduce overfitting. These methods penalize
model complexity, shrinking or eliminating the factors that do not
impact the prediction outcomes.
• Cross-validation: Cross-validation is a technique used to estimate the
generalization error of a model. It involves splitting the data into
multiple subsets, where each subset in turn is held out for testing and
the remaining subsets are used for training the model.
• Data augmentation: Data augmentation is a technique used to
increase the size of the training set by generating new, synthetic data
from the existing data.
• Dropout: Dropout is a regularization technique that randomly drops
out some of the nodes in a neural network during training.
• Ensemble learning: Ensemble learning is a technique used to combine
multiple models to improve performance and reduce overfitting.
Week 2
1. Linear Regression
• Ans: Linear regression is one of the easiest and most popular Machine
Learning algorithms.
• It is a statistical method that is used for predictive analysis.
• Linear regression makes predictions for continuous/real or numeric
variables such as sales, salary, age, product price, etc.
• The linear regression algorithm models a linear relationship between a
dependent variable (y) and one or more independent variables (x), hence
the name linear regression.
• Because the relationship is linear, the model shows how the value of
the dependent variable changes with the value of the independent
variable.
Equation
y = a0 + a1x + ε
• y = Dependent Variable (Target Variable)
• x = Independent Variable (Predictor Variable)
• a0 = intercept of the line (gives an additional degree of freedom)
• a1 = linear regression coefficient (scale factor applied to each input value)
• ε = random error
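As a minimal sketch of how a0 and a1 can be estimated, the snippet below uses the ordinary least-squares formulas for a single input variable. The function name and the toy data are hypothetical and only serve the illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (a0, a1) that minimise the sum of squared errors for y = a0 + a1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope a1 = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    # Intercept a0 makes the fitted line pass through the point of means
    a0 = mean_y - a1 * mean_x
    return a0, a1


# Hypothetical toy data generated from y = 1 + 2x with no noise
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a0, a1 = fit_simple_linear_regression(xs, ys)
prediction = a0 + a1 * 5.0  # predicted y for a new input x = 5
```

Because the toy data has no noise, the fitted line recovers the intercept 1 and slope 2 that generated it.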
Types of Linear Regression
Linear regression can be further divided into two types:
Simple Linear Regression: If a single independent variable is used to
predict the value of a numerical dependent variable, then such a Linear
Regression algorithm is called Simple Linear Regression.
Multiple Linear regression: If more than one independent variable is
used to predict the value of a numerical dependent variable, then such a
Linear Regression algorithm is called Multiple Linear Regression.
2. Principal Component Analysis (PCA)
Ans:
• Principal Component Analysis is an unsupervised learning algorithm that
is used for the dimensionality reduction in machine learning.
• It is a statistical process that converts the observations of correlated
features into a set of linearly uncorrelated features with the help of
orthogonal transformation. These new transformed features are called
the Principal Components.
• It is one of the popular tools that is used for exploratory data analysis
and predictive modelling.
• PCA generally tries to find the lower-dimensional surface to project the
high-dimensional data.
• Some real-world applications of PCA are image processing, movie
recommendation system, optimizing the power allocation in various
communication channels.
3. How to reduce dimensionality using PCA
Ans:
1. Import necessary libraries: Import the libraries required to load and
process the dataset.
2. Load the dataset: After importing the necessary libraries, load the
dataset.
3. Standardize the features: PCA is sensitive to the scale of the features,
so it is good practice to standardize the data before applying it.
4. Apply Principal Component Analysis: Apply PCA to the scaled dataset.
5. Check the correlation between features after PCA: The principal
components should be uncorrelated with each other.
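The standardize and apply steps can be sketched without any external library. This is only an illustration under stated assumptions: the dataset is hypothetical, the helper names are made up, and power iteration is used here to find the first principal component, whereas a real project would normally rely on a library implementation.

```python
import math


def standardize(rows):
    """Scale every column to zero mean and unit variance (step 3)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance_matrix(rows):
    """Covariance matrix of data that is already centred (divides by n)."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]


def first_principal_component(cov, iterations=200):
    """Power iteration: approaches the eigenvector with the largest eigenvalue."""
    d = len(cov)
    # Assumed start vector; it must not be orthogonal to the answer
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical dataset: two strongly correlated features
data = [[2.0, 4.1], [3.0, 6.2], [4.0, 7.9], [5.0, 10.1], [6.0, 11.8]]
scaled = standardize(data)
pc1 = first_principal_component(covariance_matrix(scaled))
# Step 4: project each 2-D row onto the first component to get 1-D scores
scores = [sum(a * b for a, b in zip(row, pc1)) for row in scaled]
```

Here the two-dimensional data is reduced to one dimension: each row is replaced by its score along the direction of largest variance.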
4. Selecting the principal components
Ans:
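• Scree plot: Plot the eigenvalues in descending order and keep the
components that come before the curve flattens out (the "elbow").
• Kaiser criterion: Keep the components whose eigenvalue is greater
than 1.
• Proportion of explained variance: Keep enough components to reach a
chosen share of the total variance.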
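Two of these rules are easy to express in code once the eigenvalues of the covariance matrix are known. The sketch below assumes hypothetical eigenvalues and a hypothetical 90% variance target; the function names are illustrative only.

```python
def explained_variance_ratio(eigenvalues):
    """Share of the total variance carried by each component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def components_for_variance(eigenvalues, threshold=0.90):
    """Smallest k whose cumulative explained variance reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    cumulative = 0.0
    for k, ratio in enumerate(explained_variance_ratio(ordered), start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return k
    return len(ordered)


def kaiser_criterion(eigenvalues):
    """Number of components whose eigenvalue exceeds 1 (standardized features)."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


# Hypothetical eigenvalues of a covariance matrix for 4 standardized features
eigenvalues = [2.6, 0.9, 0.3, 0.2]
k_variance = components_for_variance(eigenvalues, threshold=0.90)
k_kaiser = kaiser_criterion(eigenvalues)
```

The two rules need not agree: on these values the variance rule keeps three components while the Kaiser criterion keeps one, so the choice depends on how much information loss is acceptable.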
Week 3
1. Linear classification
Ans: Linear classification is a type of classification algorithm used in
machine learning that separates data points into two or more classes using
a linear boundary or hyperplane.
Commonly used Linear classification algorithms
Logistic regression: A probabilistic linear classification algorithm that
models the probability of a data point belonging to each class using a
logistic function.
Linear discriminant analysis (LDA): A statistical linear classification
algorithm that models the distribution of the input features for each class
using a multivariate normal distribution and finds the boundary that
maximizes the separation between the classes.
Support vector machines (SVM): A linear classification algorithm that
finds the hyperplane that maximizes the margin between the two classes.
Perceptron Algorithm: It is a type of single-layer neural network that
takes in input data and produces an output based on a set of weights and
biases.
2. Logistic regression
Ans: Logistic regression is one of the most popular Machine
Learning algorithms, which comes under the Supervised
Learning technique.
It is a statistical model used to analyse the relationship between
a categorical dependent variable and one or more independent
variables. It is used to predict the probability of the occurrence
of an event by fitting data to a logistic function.
In logistic regression, the dependent variable is binary, meaning
it can take only two values, usually represented as 0 or 1. The
independent variables can be either categorical or continuous.
The goal is to find a relationship between the independent
variables and the probability of the dependent variable being 1.
Logistic Regression is similar to Linear Regression except in how they
are used. Linear Regression is used for solving regression problems,
whereas Logistic Regression is used for solving classification
problems.
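A minimal sketch of fitting data to a logistic function follows. It assumes a single input variable, a hypothetical "hours studied" dataset, and plain gradient descent on the log loss; the learning rate and number of epochs are arbitrary choices for the illustration.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit P(y = 1 | x) = sigmoid(w*x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data: hours studied -> passed (1) or failed (0)
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
passed = [0, 0, 0, 1, 1, 1]
w, b = train_logistic_regression(hours, passed)
p_low = sigmoid(w * 1.0 + b)   # estimated probability of passing after 1 hour
p_high = sigmoid(w * 6.0 + b)  # estimated probability of passing after 6 hours
```

The output is a probability, so a class label is obtained by applying a threshold (commonly 0.5) to it.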
3. Support vector machines (SVM)
Ans: Support Vector Machine or SVM is one of the most popular
Supervised Learning algorithms, which is used for Classification as well as
Regression problems. However, primarily, it is used for Classification
problems in Machine Learning.
The goal of the SVM algorithm is to create the best line or decision
boundary that can segregate n-dimensional space into classes so that we
can easily put the new data point in the correct category in the future. This
best decision boundary is called a hyperplane.
Types of SVM
• Linear SVM
• Non-Linear SVM
Week 4
1. Binary and multi-class classification.
Ans: Binary Classification: It is a classification task in which the
given data is classified into two classes. It is basically a prediction
about which of two groups an item belongs to.
For example, in email filtering there are two discrete classes, spam and
primary, so this is a binary classification problem.
Multi-Class Classification: Multi-class classification is the task of
classifying elements into more than two classes. Unlike binary
classification, it is not restricted to two classes.
Examples of multi-class classification are
• classification of news in different categories,
• classifying books according to the subject,
• classifying students according to their streams etc.
2. Comparison of algorithms: SVM, linear perceptron,
logistic regression
Ans:
SVM: SVM is a powerful and flexible algorithm for binary classification and
also works well for multi-class classification tasks. It tries to find the
hyperplane that maximally separates the different classes in the data.
Perceptron: Perceptron is a simple and fast algorithm for binary
classification tasks. It works by finding a linear decision boundary that
separates the two classes.
Logistic Regression: Logistic Regression is a probabilistic algorithm for
binary classification tasks. It models the probability of an instance
belonging to a particular class using a logistic function.
Algorithm            Classification type      Linear / Non-linear   Probabilistic output
SVM                  Binary and Multi-class   Both                  No
Perceptron           Binary                   Linear                No
Logistic Regression  Binary and Multi-class   Linear                Yes
3. Advantages, disadvantages, and scenarios for using each
algorithm
Ans:
SVM
Advantage: Powerful and flexible algorithm for handling high-
dimensional feature spaces and non-linearly separable data.
Disadvantage: Can be computationally expensive for large datasets and
sensitive to the choice of kernel function.
Perceptron
Advantage: Simple and fast algorithm for linearly separable data.
Disadvantage: Limited in its ability to handle non-linearly separable data
and prone to overfitting with large number of features.
Logistic Regression
Advantage: Probabilistic algorithm that models the probability of an
instance belonging to a particular class, works well for linearly
separable data, and is less prone to overfitting.
Disadvantage: May not work well for non-linearly separable data and
assumes linear relationship between input features and output variable.
Week 5
1. Neural Network and Activation functions.
Ans:
Neural Network
Neural networks are an information processing paradigm inspired by the
human nervous system. Just as the human nervous system has biological
neurons, neural networks have artificial neurons, which are
mathematical functions derived from biological neurons.
A neural network is a type of artificial intelligence (AI) algorithm modeled
after the structure and function of the human brain. It consists of
interconnected nodes or artificial neurons that process and transmit
information through a series of layers. The input data is passed through
the network, and the neurons compute and modify the data based on the
strength of their connections, producing an output.
Neural networks are used in a variety of applications, including image and
speech recognition, natural language processing, etc.
Elements of a Neural Network
Input Layer: This layer accepts input features. It provides information
from the outside world to the network. No computation is performed at
this layer; nodes here just pass on the information (features) to the
hidden layer.
Hidden Layer: Nodes of this layer are not exposed to the outer world;
they are part of the abstraction provided by any neural network. The
hidden layer performs all sorts of computation on the features entered
through the input layer and transfers the result to the output layer.
Output Layer: This layer brings up the information learned by the network
to the outer world.
Activation Function
Activation functions are mathematical functions that are applied to the
output of each node in a neural network to introduce non-linearity into
the model. These functions allow neural networks to learn complex
patterns and relationships in data.
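The notes do not name specific activation functions at this point, so the following sketch simply shows four widely used ones (step, sigmoid, tanh and ReLU) as an assumption about what would typically be covered.

```python
import math


def step(z):
    """Step function: outputs 1 for non-negative input, otherwise 0."""
    return 1.0 if z >= 0 else 0.0


def sigmoid(z):
    """Squashes the input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squashes the input into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Rectified linear unit: passes positive input, clips negatives to 0."""
    return max(0.0, z)


# Apply each function to the same net input values for comparison
net_inputs = [-2.0, 0.0, 3.0]
table = {f.__name__: [f(z) for z in net_inputs] for f in (step, sigmoid, tanh, relu)}
```

Without such a non-linear function between layers, a stack of layers would collapse into a single linear transformation.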
2. Perceptron algorithm and how it works
Ans:
The Perceptron algorithm is a linear classification algorithm used for
binary classification tasks. It is a type of supervised learning algorithm that
learns a decision boundary to separate the input data into two classes. The
decision boundary is represented as a hyperplane that separates the input
data into two regions.
Working
• Initialize the weights and bias: The Perceptron algorithm starts by
initializing the weights and bias to small random values.
• Iterate over the training data: For each input data point, the
Perceptron algorithm computes the weighted sum of the inputs (each
input multiplied by its respective weight) and adds the bias term. This
is the net input value.
• Apply the activation function: The net input value is then passed
through an activation function (usually a step function) to produce the
output value. If the net input is greater than or equal to 0, the
Perceptron predicts the positive class; otherwise, it predicts the
negative class.
• Update the weights and bias: If the Perceptron makes an incorrect
prediction, the weights and bias are updated to move the decision
boundary towards the correct classification. Specifically, each weight
is updated by adding the product of the learning rate (a small positive
value that controls the magnitude of the weight update), the error (the
true class minus the predicted class), and the corresponding input
value. The bias is updated in a similar way, but without the input
value.
• Repeat steps 2-4: The Perceptron algorithm iterates over the training
data multiple times (epochs) until the decision boundary converges
and no further weight updates are required (this happens only if the
data is linearly separable).
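The working steps above can be sketched as follows. This is an illustration rather than a reference implementation: it assumes classes encoded as 0 and 1, starts from zero weights instead of small random values so that the run is reproducible, and uses the AND gate as a hypothetical linearly separable dataset.

```python
def predict(weights, bias, x):
    """Step activation applied to the net input (weighted sum plus bias)."""
    net = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if net >= 0 else 0


def train_perceptron(samples, labels, learning_rate=0.5, max_epochs=100):
    """Perceptron learning rule; stops early once an epoch needs no update."""
    weights = [0.0] * len(samples[0])  # zeros instead of random values (assumption)
    bias = 0.0
    for _ in range(max_epochs):
        updates = 0
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)  # true minus predicted
            if error != 0:
                weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
                bias += learning_rate * error
                updates += 1
        if updates == 0:  # decision boundary has converged
            break
    return weights, bias


# Hypothetical toy problem: the logical AND of two binary inputs
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

The `max_epochs` limit is a safeguard: on data that is not linearly separable the update loop would otherwise never stop.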
Weight: The weight parameter represents the strength of the connection
between units. It is one of the most important parameters of the
Perceptron.
3. Backpropagation algorithm
Ans:
Backpropagation is one of the important concepts of a neural network.
The Backpropagation algorithm is used for training artificial neural
networks. It is a supervised learning algorithm that can be used for both
classification and regression tasks.
Working
• First, we initialize the weights and biases of the neural network to small
random values.
• Then, we feed an input data point into the network and compute the
output using the current weights and biases.
• Next, we calculate the error between the predicted output and the true
output.
• The error is then propagated backwards through the network to update
the weights and biases, using a technique called gradient descent.
• Gradient descent involves calculating the gradient of the error with
respect to each weight and bias, and then updating each of them by a
small step in the direction opposite to its gradient.
• This process is repeated for multiple input data points, in batches or
one at a time, until the error is minimized and the network can
accurately predict the output for new input data.
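To make the backward pass concrete, here is a minimal sketch for the smallest possible network: one input, one sigmoid hidden unit and one linear output unit, trained with a squared-error loss. The architecture, the fixed initial parameters (used instead of random ones for reproducibility) and the two-point dataset are all assumptions made for the illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Forward pass: sigmoid hidden unit followed by a linear output unit."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    y_hat = w2 * h + b2
    return h, y_hat


def loss(params, x, y):
    """Squared error between the predicted and the true output."""
    _, y_hat = forward(params, x)
    return 0.5 * (y_hat - y) ** 2


def backward(params, x, y):
    """Gradients of the loss w.r.t. (w1, b1, w2, b2), obtained with the chain rule."""
    w1, b1, w2, b2 = params
    h, y_hat = forward(params, x)
    d_out = y_hat - y                      # error at the output
    grad_w2 = d_out * h
    grad_b2 = d_out
    d_hidden = d_out * w2 * h * (1.0 - h)  # error propagated back through the sigmoid
    grad_w1 = d_hidden * x
    grad_b1 = d_hidden
    return [grad_w1, grad_b1, grad_w2, grad_b2]


def train(params, data, learning_rate=0.5, epochs=500):
    """Gradient descent, updating after every data point (one at a time)."""
    for _ in range(epochs):
        for x, y in data:
            grads = backward(params, x, y)
            params = [p - learning_rate * g for p, g in zip(params, grads)]
    return params


data = [(0.0, 0.0), (1.0, 1.0)]   # hypothetical training set
initial = [0.5, -0.3, 0.8, 0.1]   # fixed starting values for w1, b1, w2, b2
initial_loss = sum(loss(initial, x, y) for x, y in data)
trained = train(initial, data)
final_loss = sum(loss(trained, x, y) for x, y in data)
```

A useful sanity check for any hand-written backward pass is to compare its gradients with finite-difference estimates of the loss.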
4. Hyperparameters and their selection
Ans: Hyperparameters are parameters that are not learned by the model
during training but are set by the user before training begins. They control
the behaviour of the learning algorithm and affect the performance of the
model.
Week 6
1. Decision trees
Ans: Decision Tree is a Supervised learning technique that can be used
for both classification and Regression problems, but mostly it is preferred
for solving Classification problems. It is a tree-structured classifier, where
internal nodes represent the features of a dataset, branches represent the
decision rules and each leaf node represents the outcome.
In a Decision tree, there are two types of nodes: the Decision Node and
the Leaf Node. Decision nodes are used to make decisions and have
multiple branches, whereas Leaf nodes are the outputs of those decisions
and do not contain any further branches.
It is a graphical representation for getting all the possible solutions to a
problem/decision based on given conditions.
Root Node: Root node is from where the decision tree starts. It represents
the entire dataset, which further gets divided into two or more
homogeneous sets.
Leaf Node: Leaf nodes are the final output nodes, and the tree cannot be
segregated further after reaching a leaf node.
Splitting: Splitting is the process of dividing the decision node/root node
into sub-nodes according to the given conditions.
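The notes do not say which splitting criterion is used, so the sketch below assumes Gini impurity, one common choice, and searches for the best threshold on a single numeric feature. The dataset and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity: 0 for a pure node, larger when classes are mixed."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Try midpoints between sorted distinct values; return (threshold, impurity)."""
    best = (None, gini(labels))  # no split: impurity of the parent node
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        # Impurity of the split, weighted by the size of each sub-node
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical data: customer age -> whether they bought a product
ages = [22, 25, 30, 35, 40, 45]
buys = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_threshold(ages, buys)
```

A full tree builder would repeat this search for every feature and then recurse on the two resulting sub-nodes.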
2. Pruning Techniques
Ans: Pruning is a process of deleting the unnecessary nodes from a tree
in order to get the optimal decision tree.
A too-large tree increases the risk of overfitting, and a small tree may not
capture all the important features of the dataset. Therefore, a technique
that decreases the size of the learning tree without reducing accuracy is
known as Pruning.
Types of pruning techniques
• Cost Complexity Pruning
• Reduced Error Pruning.
3. Regression trees
Ans: A regression tree is basically a decision tree that is used for the task
of regression which can be used to predict continuous valued outputs
instead of discrete outputs.
Advantages of regression trees
• Visualization of data becomes easier as users can identify and process
each and every step.
• A specific decision node could be set to have a priority against other
decision nodes.
• As the regression tree progresses, undesired data will be filtered at
each step. As a result, only important data is left to process, which
increases the efficiency and accuracy of our design.
• It is easy to prepare regression trees – they can be used to present data
during meetings, presentations, etc.
4. Importance of stopping criteria
Ans: Stopping criteria are used in decision trees to determine when to
stop the recursive process of splitting the data into smaller subsets. The
stopping criteria help prevent overfitting, where the model becomes too
complex and starts to fit the noise in the data instead of the underlying
pattern.
Commonly used stopping criteria in decision trees
• Maximum depth: This stopping criterion sets a limit on the maximum
depth of the tree. Once a tree reaches this depth, no more splits are
allowed, and the node becomes a leaf node.
• Minimum number of samples: This stopping criterion specifies a
minimum number of samples required to split a node. If a node has
fewer samples than this threshold, it is not split, and it becomes a leaf
node.
• Maximum number of leaf nodes: This stopping criterion sets a limit
on the maximum number of leaf nodes that a tree can have. Once the
tree has reached this limit, no more splits are allowed, and the tree is
pruned to remove unnecessary branches.
Week 7
1. ROC curve.
Ans: ROC (Receiver Operating Characteristic) curve is a graphical
plot that illustrates the performance of a binary classifier at
different classification thresholds. The curve is created by
plotting the true positive rate (TPR) against the false positive rate
(FPR) at various threshold settings.
The true positive rate (TPR) is also called sensitivity and is defined
as the fraction of positive instances that are correctly identified
by the classifier. The false positive rate (FPR) is defined as the
fraction of negative instances that are incorrectly classified as
positive by the classifier.
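The definitions of TPR and FPR translate directly into code. The sketch below assumes a hypothetical list of classifier scores with true labels encoded as 0 and 1, and computes one point of the ROC curve per threshold.

```python
def roc_points(scores, labels, thresholds):
    """Return one (FPR, TPR) pair per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        # An instance is predicted positive when its score reaches the threshold
        predicted = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical classifier scores and true labels (1 = positive class)
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels, thresholds=[1.0, 0.75, 0.5, 0.2, 0.0])
```

Lowering the threshold moves the point from (0, 0), where nothing is predicted positive, towards (1, 1), where everything is.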
2. Cross-Validation and K-fold method
Ans: Cross-validation is a technique for validating the model efficiency
by training it on the subset of input data and testing on previously unseen
subset of the input data. We can also say that it is a technique to check
how a statistical model generalizes to an independent dataset.
The basic idea of cross-validation is to split the data into two or more
subsets, where one subset is used for training the model, and the other
subsets are used for testing the model. The training subset is used to fit
the model, and the testing subset is used to evaluate the performance of
the model on new, unseen data. This process is repeated multiple times
with different subsets, and the performance metrics are averaged to
obtain an estimate of the generalization performance.
K-Fold Cross-Validation
K-fold cross-validation approach divides the input dataset into K groups
of samples of approximately equal size. These groups are called folds.
The model is trained on k-1 folds and tested on the remaining fold. This
process is repeated k times with each fold serving as the testing set
once. The
performance metrics are averaged over the k iterations to obtain an
estimate of the generalization performance.
The steps for k-fold cross-validation
• Split the input dataset into K groups
• For each group:
o Take one group as the reserve or test data set.
o Use remaining groups as the training dataset
o Fit the model on the training set and evaluate the performance of
the model using the test set.
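The steps above can be sketched in a few lines. The example assumes no shuffling, a deliberately trivial hypothetical "model" that always predicts the mean of its training targets, and mean absolute error as the performance metric.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # The first n_samples % k folds get one extra sample
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(xs, ys, k, fit, evaluate):
    """Average the evaluation score over the k train/test splits."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(evaluate(model, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(scores) / k


def fit_mean(train_x, train_y):
    """Hypothetical model: always predicts the mean training target."""
    return sum(train_y) / len(train_y)


def mean_absolute_error(model, test_x, test_y):
    return sum(abs(model - y) for y in test_y) / len(test_y)


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
folds = list(k_fold_indices(len(xs), 3))
score = cross_validate(xs, ys, 3, fit_mean, mean_absolute_error)
```

In practice the data is usually shuffled before splitting so that each fold is representative of the whole dataset.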
3. Ensemble Methods.
Ans: Ensemble methods in machine learning are techniques that combine
multiple individual models to create a more accurate and robust model.
Ensemble methods are based on the idea that combining multiple models
can often lead to better performance than using a single model.
4. Bagging and Boosting
Ans: Bagging, also known as Bootstrap aggregating, is an ensemble
learning technique that helps to improve the performance and accuracy
of machine learning algorithms. It trains several models on bootstrap
samples of the training data (drawn with replacement) and combines
their predictions, which reduces the variance of the prediction model.
Bagging helps reduce overfitting and is used for both regression and
classification models, particularly decision tree algorithms.
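A minimal sketch of bagging for classification is given below. The base learner is a hypothetical one-feature "stump" that assumes class 1 has the larger feature values; the dataset, the number of models and the random seed are likewise arbitrary choices for the illustration.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def fit_stump(sample):
    """Hypothetical base learner: threshold halfway between the two class means."""
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    if not zeros or not ones:
        # A bootstrap sample may contain only one class: predict that class
        constant = 1 if ones else 0
        return lambda x: constant
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > threshold else 0


def bagging_fit(data, n_models=25, seed=0):
    """Train one base learner per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(data, rng)) for _ in range(n_models)]


def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = sum(model(x) for model in models)
    return 1 if votes * 2 > len(models) else 0


# Hypothetical one-feature dataset of (feature value, class) pairs
data = [(1, 0), (2, 0), (3, 0), (4, 0), (5, 1), (6, 1), (7, 1), (8, 1)]
models = bagging_fit(data)
low_class = bagging_predict(models, 0)
high_class = bagging_predict(models, 10)
```

For regression the same idea applies, with the majority vote replaced by the average of the individual predictions.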
Boosting is a machine learning technique for building ensemble models,
in which multiple weak learners are combined to form a strong learner. A
weak learner is a model that performs only slightly better than random
guessing. Boosting trains the weak learners one after another: each new
model is typically fit after the training examples have been reweighted
so that the examples the earlier models handled badly count for more.
Types of Boosting
• Adaptive Boosting (AdaBoost)
• Gradient Boosting
• Extreme Gradient Boosting (XGBoost)
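The reweighting idea can be sketched with a small AdaBoost-style example. It assumes one-dimensional inputs, labels of +1 and -1, and threshold rules ("decision stumps") as the weak learners; the data and function names are made up for illustration.

```python
import math

def best_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in xs:
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            error = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost(xs, ys, n_rounds):
    n = len(xs)
    weights = [1.0 / n] * n          # start with uniform example weights
    ensemble = []                    # list of (alpha, threshold, polarity)
    for _ in range(n_rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Raise the weights of misclassified examples, lower the rest,
        # then renormalize so the weights sum to one.
        preds = [polarity if x >= threshold else -polarity for x in xs]
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * (pol if x >= thr else -pol)
                for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, 1]
model = adaboost(xs, ys, n_rounds=3)
accuracy = sum(predict(model, x) == y for x, y in zip(xs, ys)) / len(xs)
```

No single threshold rule can separate this data (the best one gets 7 of 10 right), but the weighted vote of three stumps classifies all ten training points correctly.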
Week 8
1. Random Forest
Ans: Random Forest is a popular supervised learning algorithm that can
be used for both classification and regression problems. It is an
ensemble learning method: it trains many decision trees, each on a
bootstrap sample of the data and with a random subset of the features
considered at each split, and combines their predictions (majority vote
for classification, averaging for regression) to form a stronger model.
Random Forest has several advantages over other machine learning
algorithms. It can handle a large number of input features and can work
well even when some of the features are irrelevant or redundant. It is also
relatively insensitive to overfitting, since the predictions of the individual
trees are combined to form a more robust overall prediction.
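The sketch below illustrates the two sources of randomness in a random forest (bootstrap samples of the rows and a random choice of feature) followed by a majority vote. It is heavily simplified: each "tree" is a single-split stump on one randomly chosen feature, whereas a real random forest grows full trees and samples features at every split. The toy data and function names are assumptions.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Best single-threshold split on one feature (a depth-1 tree)."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] < threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] >= threshold]
        if not left or not right:
            continue
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # no valid split: always predict the majority class
        majority = Counter(labels).most_common(1)[0][0]
        return feature, float("-inf"), majority, majority
    return feature, best[1], best[2], best[3]

def fit_forest(rows, labels, n_trees=15, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of rows
        feature = rng.randrange(n_features)         # random feature for this tree
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Majority vote over all trees."""
    votes = [left if row[f] < t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]

rows = [(1, 5), (2, 4), (3, 6), (6, 1), (7, 2), (8, 3)]
labels = ["a", "a", "a", "b", "b", "b"]
forest = fit_forest(rows, labels)
pred_a = predict(forest, (1, 6))
pred_b = predict(forest, (8, 1))
```

Because each tree sees different rows and a different feature, an unlucky tree is outvoted by the rest, which is the source of the robustness described above.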
2. Naive Bayes and Bayesian Networks
Ans:
Bayesian Networks
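Bayesian networks, also known as Bayes networks or belief networks, are
a type of probabilistic graphical model used in machine learning, statistics,
and artificial intelligence. Each node of a directed acyclic graph
represents a random variable, and each edge represents a direct
conditional dependency, so the joint distribution factorizes into one
conditional probability per node given its parents. They are used to
represent and reason about uncertainty in complex systems.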
Bayesian networks can be used for a variety of tasks in machine learning,
including classification, regression, anomaly detection, and decision
making under uncertainty. They have several advantages over other
machine learning algorithms, such as the ability to handle missing data
and the ability to incorporate domain knowledge into the model.
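To make the idea concrete, here is a minimal sketch of inference in a three-node network (Rain and Sprinkler both influence WetGrass). The network structure and every probability in the tables are hypothetical numbers chosen for the example; inference is done by brute-force enumeration, which only scales to very small networks.

```python
from itertools import product

# Hypothetical probability tables (illustrative numbers only).
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}
# P(wet = True | rain, sprinkler)
P_WET = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    """Joint probability factorized along the graph: P(R) P(S) P(W | R, S)."""
    p_wet = P_WET[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1 - p_wet)

def posterior_rain_given_wet():
    """P(rain | wet), summing the joint over the unobserved sprinkler."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True)
                   for r, s in product((True, False), repeat=2))
    return numerator / evidence

p_rain_given_wet = posterior_rain_given_wet()
```

With these numbers, observing wet grass raises the probability of rain from the prior of 0.2 to about 0.74. The sprinkler is never observed, yet the network still gives an answer, which is one way Bayesian networks handle missing data.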
Naïve Bayes
• Naïve Bayes is a supervised learning algorithm based on Bayes' theorem
and used for solving classification problems. It is called naïve because
it assumes the features are conditionally independent given the class.
• It is mainly used in text classification, where the training data is
high-dimensional.
• The Naïve Bayes classifier is one of the simplest classification
algorithms; it is fast to train, makes quick predictions, and is often
effective in practice.
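A minimal sketch of a Naïve Bayes text classifier follows. It assumes a word-count (multinomial) model with add-one (Laplace) smoothing; the four training "documents", the labels, and the function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (list_of_words, label). Returns the counts used to predict."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)   # label -> word -> count
    vocab = set()
    for words, label in docs:
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def predict_nb(model, words):
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # log P(class) + sum of log P(word | class); the sum over words is
        # where the "naive" independence assumption is used.
        score = math.log(count / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            # Add-one smoothing so unseen words do not give probability zero.
            score += math.log((word_counts[label][w] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

docs = [
    ("win money now".split(), "spam"),
    ("free money offer".split(), "spam"),
    ("meeting at noon".split(), "ham"),
    ("project meeting notes".split(), "ham"),
]
model = train_nb(docs)
label = predict_nb(model, "free money".split())
```

Training only counts words per class, which is why the method stays fast even when the vocabulary (the number of features) is large.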
Week 12
1. Introduction to reinforcement learning
Ans: Reinforcement Learning is a feedback-based machine learning
technique in which an agent (a software component) explores its
surroundings by trial and error: it takes actions, learns from
experience, and improves its performance. The agent is rewarded for
each good action and penalized for each bad action; hence the goal
of a reinforcement learning agent is to maximize its cumulative reward.
In Reinforcement Learning, the agent learns from this feedback
without any labelled data, unlike supervised learning.
Agent: An entity that can perceive/explore the environment and act upon
it.
Environment: The world or situation that the agent is in and interacts with.
Action: Actions are the moves taken by an agent within the environment.
State: State is a situation returned by the environment after each action
taken by the agent.
Reward: A feedback returned to the agent from the environment to
evaluate the action of the agent.
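The interaction between agent, environment, action, state, and reward can be sketched with tabular Q-learning, one common RL algorithm (named here as an illustration; the notes above do not prescribe a specific method). The environment is a hypothetical five-cell corridor in which the agent starts at cell 0 and receives a reward of 1 on reaching cell 4; the hyperparameter values are arbitrary choices.

```python
import random

N_STATES = 5          # states 0..4 in a corridor; state 4 is the goal
ACTIONS = (-1, +1)    # move left or move right

def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: try a random action (trial and error).
                action = rng.choice(ACTIONS)
            else:
                # Exploit: take the best-known action, breaking ties randomly.
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice(
                    [a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q = train()
# Greedy policy for the non-goal states: the learned best action in each.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

The reward arrives only at the goal, so the agent has to propagate its value backwards through earlier states; after training, the greedy policy moves right in every cell.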
Advantages and Disadvantages
Advantages
• Flexibility: Reinforcement learning can be used in a variety of problem
domains, including robotics, games, and finance. It can also handle
problems with continuous state and action spaces.
• Adaptability: RL agents can learn from experience and adjust their
behaviour to changing environments.
• Autonomy: Once an RL agent has been trained, it can operate
autonomously, without the need for constant supervision.
• Optimal decision-making: Reinforcement learning can find good, and
under suitable conditions optimal, decision-making policies for a given
task, which can lead to better outcomes than human-designed policies.
Disadvantages
• Requires a large amount of data (experience) and computation to learn
an effective policy.
• RL algorithms are not preferred for simple problems, where simpler
methods are sufficient.
• Too much reinforcement can lead to an overload of states, which can
weaken the results.
2. Features and applications of reinforcement learning
Ans:
Features
• In RL, the agent is not instructed about the environment and what
actions need to be taken.
• It is based on a trial-and-error process.
• The agent takes the next action and changes states according to the
feedback of the previous action.
• The agent may get a delayed reward.
• The environment may be complex and unknown in advance, so the agent
needs to explore it to obtain the maximum positive reward.
Applications
Robotics: Reinforcement learning has been applied to train robots to
perform tasks such as object manipulation, navigation, and grasping.
For example, RL has been used to teach robots to play table tennis,
fold towels, and open doors.
Game Playing: Reinforcement learning has been successfully applied
 to train agents that can play games such as Chess, Go, and Atari
 video games.
Natural Language Processing: Reinforcement learning is being used
to improve natural language processing tasks such as machine
translation, question-answering, and chatbots. RL helps the agent
learn how to generate better responses based on the feedback it
receives.
Traffic light control: RL can be used to optimize the timing of
 traffic signals. An RL controller can learn from historical traffic
 data and adjust signal timings dynamically, based on traffic
 volumes and congestion levels, to minimize waiting times and
 improve traffic flow.
Driverless cars: Reinforcement learning is used in driverless-car
 research to learn how to navigate through traffic and make decisions
 based on the environment the car encounters. RL algorithms help the
 car learn from past experience, for example adapting to changing road
 conditions and choosing manoeuvres based on the feedback it receives.
 Reinforcement learning is one of several techniques used in the
 development of autonomous vehicles.
3. Comparison of reinforcement learning and deep learning
Ans:
Aspect | Reinforcement Learning (RL) | Deep Learning (DL)
Objective | Learn to take actions that maximize a reward signal | Learn to generalize patterns in data
Feedback | Received as rewards or penalties for actions taken in an environment | Received as the error between predicted and actual outputs
Training | Requires a trial-and-error approach and a feedback mechanism to guide learning | Trained on input data, using backpropagation to minimize error
Use cases | Robotics, game playing, autonomous systems | Image recognition, speech recognition, natural language processing, data analysis
Data requirements | Needs no labelled dataset, but typically needs many interactions with the environment | Requires large amounts of input data
Computational resources | Often high, because experience must be generated through interaction | Often high, because large networks are trained on large datasets
4. Comparison of supervised and reinforcement learning
Ans:
Reinforcement Learning | Supervised Learning
RL works by interacting with the environment. | Supervised learning works on an existing dataset.
The RL algorithm works the way the human brain does when making decisions. | Supervised learning works the way a human learns under the supervision of a guide.
No labelled dataset is present. | A labelled dataset is present.
No previous training is provided to the learning agent. | Training is provided to the algorithm so that it can predict the output.
RL helps to make decisions sequentially. | In supervised learning, a decision is made each time an input is given.

NPTL Machine Learning Syllabus

  • 1. NPTL – Machine Learning Created By: Madhur Jatiya Email: Madhurjatiya13@gmail.com Machine Learning Syllabus Week 1: Introduction: Statistical Decision Theory - Regression, Classification, Bias Variance. Week 2: Linear Regression, Multivariate Regression, Subset Selection, Shrinkage Methods, Principal Component Regression, Partial Least squares. Week 3: Linear Classification, Logistic Regression, Linear Discriminant Analysis. Week 4: Perceptron, Support Vector Machines. Week 5: Neural Networks Introduction, Early Models, Perceptron Learning, Backpropagation, Initialization, Training & Validation, Parameter Estimation - MLE, MAP, Bayesian Estimation. Week 6: Decision Trees, Regression Trees, Stopping Criterion & Pruning loss functions, Categorical Attributes, Multiway Splits, Missing Values, Decision Trees - Instability Evaluation Measures. Week 7: Bootstrapping & Cross Validation, Class Evaluation Measures, ROC curve, MDL, Ensemble Methods - Bagging, Committee Machines and Stacking, Boosting Week 8: Gradient Boosting, Random Forests, Multi-class Classification, Naive Bayes, Bayesian Networks Week 9: Undirected Graphical Models, HMM, Variable, Elimination, Belief Propagation. Week 10: Partitional Clustering, Hierarchical Clustering, Birch Algorithm, CURE Algorithm, Density-based Clustering. Week 11: Gaussian Mixture Models, Expectation Maximization. Week 12: Learning Theory, Introduction to Reinforcement Learning, Optional videos (RL framework, TD learning. Solution Methods, Applications).
  • 2. NPTL – Machine Learning Created By: Madhur Jatiya Email: Madhurjatiya13@gmail.com Important Questions Week 1 • What is machine learning? • Supervised Learning • Unsupervised learning • States to implement supervised learning • Classification, overfitting, bias-variance trade-off • Common techniques to avoid overfitting Week 2 • Linear Regression • Principal Component Analysis (PCA) • How to reduce dimensionality using PCA • Selecting the principal components Week 3 • Linear classification • Logistic regression • Support Vector Machines Week 4 • Binary and multi-class classification contexts • Comparison of algorithms: SVM, perceptron, logistic regression • Advantages, disadvantages, and scenarios for using each algorithm Week 5 • Neural Network and activation functions • Perceptron algorithm and how it works • Backpropagation algorithm • Hyperparameters and their selection
  • 3. NPTL – Machine Learning Created By: Madhur Jatiya Email: Madhurjatiya13@gmail.com Week 6 • Decision Tree • Pruning Techniques • Regression tree • Importance of stopping criteria Week 7 • ROC Curve • Cross Validation and K-fold method • Ensemble Methods • Bagging and boosting techniques Week 8 • Random Forest • Naive Bayes and Bayesian Networks Week 12 • Reinforcement learning • Comparison of Supervised and Reinforcement Learning • Features and Applications of Reinforcement Learning • Comparison of Reinforcement Learning and Deep Learning
  • 4. NPTL – Machine Learning Created By: Madhur Jatiya Email: Madhurjatiya13@gmail.com Week 1 1. What is Machine Learning? Ans: Machine learning is the process of using algorithms and statistical models to allow a computer system to improve its performance on a specific task by learning from data, without being explicitly programmed. It involves training a model on a dataset and then using that model to make predictions on new data. Classification of Machine Learning • Supervised learning • Unsupervised learning • Reinforcement learning Application of Machine Learning • Image Recognition: Image recognition is one of the most common applications of machine learning. It is used to identify objects, persons, places, digital images, etc. The popular use case of image recognition and face detection.
  • 5. NPTL – Machine Learning Created By: Madhur Jatiya Email: Madhurjatiya13@gmail.com • Speech Recognition: While using Google, we get an option of "Search by voice," it comes under speech recognition, and it's a popular application of machine learning. • Traffic prediction: If we want to visit a new place, we take help of Google Maps, which shows us the correct path with the shortest route and predicts the traffic conditions. • Self-driving cars: One of the most exciting applications of machine learning is self-driving cars. 2. Supervised Learning Ans: Supervised learning is a type of machine learning in which the model is trained using labelled data, where the outcome or response variable is known. The goal of SL is to create a model that can accurately predict the output for new input data. Types of supervised Machine learning Algorithms
  • 6. NPTL – Machine Learning Created By: Madhur Jatiya Email: Madhurjatiya13@gmail.com Regression: Regression algorithms are used if there is a relationship between the input variable and the output variable. It is used for the prediction of continuous variables, such as Weather forecasting, Market Trends, etc. • Linear Regression • Non-Linear Regression • Regression Trees • Bayesian Linear Regression • Polynomial Regression Classification: Classification algorithms are used when the output variable is categorical, which means there are two classes such as Yes-No, Male-Female, True-false, etc. • Random Forest • Decision Trees • Logistic Regression • Support vector Machines Advantages of Supervised learning • With the help of supervised learning, the model can predict the output on the basis of prior experiences. • In supervised learning, we can have an exact idea about the classes of objects. • Supervised learning model helps us to solve various real-world problems such as fraud detection, spam filtering, etc. Disadvantages of Supervised learning • Supervised learning models are not suitable for handling the complex tasks. • Supervised learning cannot predict the correct output if the test data is different from the training dataset. • Training required lots of computation times.
  • 7.
    3. Unsupervised Learning
    Ans: Unsupervised learning is a type of machine learning in which the model is trained on unlabelled data, where the outcome or response variable is not known. The goal of unsupervised learning is to find patterns or structure in the data.
    Types of unsupervised learning algorithms
    Clustering: Clustering groups objects into clusters so that objects in the same cluster are highly similar to one another and have little or no similarity to objects in other clusters.
    Association: Association rule learning is an unsupervised method for finding relationships between variables in a large database.
  • 8.
    4. Steps to implement Supervised Learning
    Ans:
    • Collect labelled data: Gather labelled examples of the input-output relationship.
    • Pre-process the data: Clean, format, and prepare the data for training the model.
    • Feature selection: Choose the most important features from the data to reduce dimensionality.
    • Model selection: Choose an appropriate model that can learn the input-output relationship from the data.
    • Hyperparameter tuning: Adjust the model's hyperparameters to optimize performance.
    • Train the model: Feed the data into the model and adjust its parameters to minimize the difference between predicted and actual output.
    • Evaluate the model's performance: Use various metrics to measure the model's ability to generalize to new data.
    5. Overfitting, Underfitting, bias-variance trade-off.
    Ans:
    Overfitting
    • Overfitting occurs when a machine learning model tries to fit every data point in the training set, including points that do not reflect the underlying trend.
    • Because of this, the model starts capturing the noise and inaccurate values present in the dataset, which reduces its accuracy on new data.
    • An overfitted model has low bias and high variance.
    • The risk of overfitting grows the longer we train the model on the same data.
  • 9.
    Underfitting
    • Underfitting occurs when a machine learning model is not able to capture the underlying trend of the data.
    • It can arise when training is stopped too early in an attempt to avoid overfitting, so that the model does not learn enough from the training data.
    • As a result, the model may fail to find the best fit of the dominant trend in the data.
    • An underfitted model is inaccurate and produces unreliable predictions.
    • An underfitted model has high bias and low variance.
  • 10.
    Bias-Variance Trade-Off
    Reducible errors: These errors can be reduced to improve the model's accuracy. They are further classified into bias and variance.
    Irreducible errors: These errors are always present, whichever model is used, because they come from noise in the data.
    Bias: The difference between the model's average prediction and the actual (expected) value is known as the bias error, or error due to bias.
    Variance: The variance of a model is how much its prediction for a given data point would change if the model were trained on a different training set.
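The split into reducible and irreducible error above is commonly written as a decomposition of the expected squared error. This is a standard textbook form rather than something taken from the slides: it assumes y = f(x) + ε with zero-mean noise of variance σ², and the symbols f (true function), f̂ (fitted model) and σ² are notation introduced here.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The expectations are taken over training sets (and the noise), which is why variance measures sensitivity to the particular training data.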
  • 11.
    Bias-Variance Trade-Off: While building a machine learning model, it is important to control both bias and variance in order to avoid overfitting and underfitting. A very simple model with few parameters tends to have low variance and high bias, whereas a model with a large number of parameters tends to have high variance and low bias. A balance is therefore needed between the two kinds of error, and this balance is known as the bias-variance trade-off.
    6. Common techniques to avoid overfitting
    Ans:
    • Early stopping: Early stopping halts the training phase before the machine learning model starts to learn the noise in the data.
    • Regularization: Regularization is a collection of training/optimization techniques that seek to reduce overfitting, typically by penalizing model complexity so that features contributing little to the prediction have less influence.
    • Cross-validation: Cross-validation is a technique used to estimate the generalization error of a model. It splits the data into multiple subsets; each subset takes a turn as the test set while the remaining subsets are used for training.
  • 12.
    • Data augmentation: Data augmentation is a technique used to increase the size of the training set by generating new, synthetic data from the existing data.
    • Dropout: Dropout is a regularization technique that randomly drops out some of the nodes in a neural network during training.
    • Ensemble learning: Ensemble learning is a technique used to combine multiple models to improve performance and reduce overfitting.
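Of the techniques listed above, early stopping is the easiest to sketch in a few lines. The following is a minimal illustration, not a definitive implementation: the function name `early_stopping`, the `patience` parameter and the loss values are all hypothetical. It assumes a validation loss is measured once per epoch and that training halts when the loss has not improved for `patience` consecutive epochs.

```python
def early_stopping(val_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # Never triggered: training ran through every epoch.
    return best_epoch, len(val_losses) - 1


# Hypothetical validation losses: they fall, then rise as the model starts to overfit.
losses = [1.00, 0.80, 0.60, 0.65, 0.70, 0.72]
best, stopped = early_stopping(losses, patience=2)
print("best epoch:", best, "stopped at epoch:", stopped)
```

In practice the model parameters saved at the best epoch would be restored after stopping.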
  • 13.
    Week 2
    1. Linear Regression
    Ans:
    • Linear regression is one of the simplest and most popular machine learning algorithms.
    • It is a statistical method used for predictive analysis.
    • Linear regression predicts continuous (real-valued or numeric) variables such as sales, salary, age, product price, etc.
    • It models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name linear regression.
    • Because the relationship is linear, the model describes how the value of the dependent variable changes with the value of the independent variable.
    Equation
    y = a0 + a1x + ε
    • y = dependent variable (target variable)
    • x = independent variable (predictor variable)
    • a0 = intercept of the line (gives an additional degree of freedom)
    • a1 = linear regression coefficient (scale factor applied to each input value)
    • ε = random error
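As a small worked example of the equation above, a0 and a1 can be estimated by ordinary least squares. This sketch assumes the single-variable case and uses the closed-form estimates; the function name and the data points are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares estimates of a0 (intercept) and a1 (slope) in y = a0 + a1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # a1 = covariance(x, y) / variance(x); a0 makes the line pass through the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1


# Hypothetical data lying roughly on the line y = 3 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.1, 6.9, 9.2, 10.8, 13.0]
a0, a1 = fit_simple_linear_regression(xs, ys)

# The residuals play the role of the error term ε in the equation.
residuals = [y - (a0 + a1 * x) for x, y in zip(xs, ys)]
print(f"a0 = {a0:.2f}, a1 = {a1:.2f}")
```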
  • 14.
    Types of Linear Regression
    Linear regression can be divided into two types:
    Simple Linear Regression: A single independent variable is used to predict the value of a numerical dependent variable.
    Multiple Linear Regression: More than one independent variable is used to predict the value of a numerical dependent variable.
    2. Principal Component Analysis (PCA)
    Ans:
    • Principal Component Analysis is an unsupervised learning algorithm used for dimensionality reduction in machine learning.
    • It is a statistical procedure that converts observations of correlated features into a set of linearly uncorrelated features by means of an orthogonal transformation. These new transformed features are called the principal components.
    • It is a popular tool for exploratory data analysis and predictive modelling.
    • PCA tries to find a lower-dimensional surface onto which the high-dimensional data can be projected.
    • Some real-world applications of PCA are image processing, movie recommendation systems, and optimizing power allocation in communication channels.
    3. How to reduce dimensionality using PCA
    Ans:
    1. Import necessary libraries: Import the libraries required to load and process the dataset.
    2. Load the dataset: After importing the libraries, load the dataset.
    3. Standardize the features: Before applying PCA it is good practice to standardize the data, because PCA is sensitive to the scale of the features.
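  • 15.
    4. Apply Principal Component Analysis: Apply PCA to the scaled dataset.
    5. Check the correlation between features after PCA: the principal components should be uncorrelated.
    4. Selecting the principal components
    Ans:
    • Scree plot: Plot the eigenvalues in decreasing order and keep the components before the curve flattens (the "elbow").
    • Kaiser criterion: Keep the components whose eigenvalue is greater than 1.
    • Proportion of explained variance: Keep enough components to reach a chosen share of the total variance.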
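Steps 3 and 4 of the PCA procedure, together with the proportion-of-explained-variance criterion, can be sketched without any libraries. This is a simplified illustration under stated assumptions: it finds only the first principal component, uses power iteration (one of several ways to obtain eigenvectors), and the helper names and the two-feature dataset are hypothetical.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit (population) variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Covariance matrix of columns that already have zero mean."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]


def top_component(cov, iterations=200):
    """Power iteration: largest eigenvalue of cov and its unit eigenvector.

    Assumes the start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigenvalue, v


# Hypothetical dataset with two strongly correlated features.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
z = standardize(data)
cov = covariance(z)
eigenvalue, pc1 = top_component(cov)

# Share of the total variance captured by the first principal component.
explained_ratio = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

# Two features reduced to one: project every row onto the first component.
projected = [sum(row[j] * pc1[j] for j in range(len(pc1))) for row in z]
print(f"explained variance ratio of PC1: {explained_ratio:.4f}")
```

Because the two features are almost perfectly correlated, one component keeps nearly all of the variance, which is the situation in which dropping the second dimension loses little information.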
  • 16.
    Week 3
    1. Linear classification
    Ans: Linear classification is a type of classification used in machine learning that separates data points into two or more classes using a linear boundary (a hyperplane).
    Commonly used linear classification algorithms
    Logistic regression: A probabilistic linear classification algorithm that models the probability of a data point belonging to each class using a logistic function.
    Linear discriminant analysis (LDA): A statistical linear classification algorithm that models the distribution of the input features for each class as a multivariate normal distribution and finds the boundary that maximizes the separation between the classes.
    Support vector machines (SVM): A linear classification algorithm that finds the hyperplane that maximizes the margin between the two classes.
    Perceptron algorithm: A single-layer neural network that takes input data and produces an output based on a set of weights and a bias.
    2. Logistic regression
    Ans: Logistic regression is one of the most popular machine learning algorithms and belongs to the supervised learning family. It is a statistical model used to analyse the relationship between a categorical dependent variable and one or more independent variables. It predicts the probability that an event occurs by fitting the data to a logistic function. In (binary) logistic regression, the dependent variable can take only two values, usually represented as 0 and 1. The
  • 17.
    independent variables can be either categorical or continuous. The goal is to find a relationship between the independent variables and the probability of the dependent variable being 1. Logistic regression is very similar to linear regression; the difference lies in how they are used. Linear regression is used for solving regression problems, whereas logistic regression is used for solving classification problems.
    3. Support vector machines (SVM)
    Ans: Support Vector Machine, or SVM, is one of the most popular supervised learning algorithms. It can be used for classification as well as regression problems, but it is primarily used for classification. The goal of the SVM algorithm is to find the best line or decision boundary that separates n-dimensional space into classes, so that new data points can easily be placed in the correct category. This best decision boundary is called a hyperplane.
    Types of SVM
    • Linear SVM
    • Non-Linear SVM
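Returning to logistic regression (question 2 of this week), the idea of fitting data to a logistic function can be sketched for a single input variable. This is a minimal sketch, assuming batch gradient descent on the log loss; the function names, the learning rate, the epoch count and the hours-studied dataset are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data: hours studied -> passed (1) or failed (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
w, b = train_logistic_regression(hours, passed)

p_low = sigmoid(w * 1.0 + b)    # predicted probability of passing after 1 hour
p_high = sigmoid(w * 5.0 + b)   # predicted probability of passing after 5 hours
print(f"P(pass | 1 h) = {p_low:.2f}, P(pass | 5 h) = {p_high:.2f}")
```

The output is a probability; a class label is obtained by thresholding it, commonly at 0.5.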
  • 18.
    Week 4
    1. Binary and multi-class classification.
    Ans:
    Binary Classification: A classification task in which the data is assigned to one of two classes. It is a prediction about which of two groups an item belongs to. For example, an email filter that sorts messages into spam and primary has two discrete classes, so it is a binary classification problem.
    Multi-Class Classification: Multi-class classification is the task of classifying elements into more than two classes. Examples of multi-class classification are
    • classifying news into different categories,
    • classifying books according to subject,
    • classifying students according to their streams, etc.
    2. Comparison of algorithms: SVM, linear perceptron, logistic regression
    Ans:
    SVM: SVM is a powerful and flexible algorithm for binary classification that also works well for multi-class classification tasks. It tries to find the hyperplane that maximally separates the classes in the data.
    Perceptron: The perceptron is a simple and fast algorithm for binary classification tasks. It works by finding a linear decision boundary that separates the two classes.
    Logistic Regression: Logistic regression is a probabilistic algorithm for binary classification tasks. It models the probability of an instance belonging to a particular class using a logistic function.
  • 19.
    SVM | Binary and Multi-class | Both | No
    Perceptron | Binary | Linear | No
    Logistic Regression | Binary and Multi-class | Linear | Yes
    3. Advantages, disadvantages, and scenarios for using each algorithm
    Ans:
    SVM
    Advantage: Powerful and flexible algorithm for handling high-dimensional feature spaces and non-linearly separable data.
    Disadvantage: Can be computationally expensive for large datasets and is sensitive to the choice of kernel function.
    Perceptron
    Advantage: Simple and fast algorithm for linearly separable data.
    Disadvantage: Limited in its ability to handle non-linearly separable data and prone to overfitting with a large number of features.
  • 20.
    Logistic Regression
    Advantage: Probabilistic algorithm that models the probability of an instance belonging to a particular class, works well for linearly separable data, and is less prone to overfitting.
    Disadvantage: May not work well for non-linearly separable data, and assumes a linear relationship between the input features and the output variable.
  • 21.
    Week 5
    1. Neural Network and Activation functions.
    Ans:
    Neural Network
    Neural networks are an information processing paradigm inspired by the human nervous system. Just as the human nervous system is made of biological neurons, a neural network is made of artificial neurons, which are mathematical functions modelled on biological neurons.
    A neural network is a type of artificial intelligence (AI) algorithm modelled after the structure and function of the human brain. It consists of interconnected nodes, or artificial neurons, that process and transmit information through a series of layers. The input data is passed through the network, and the neurons compute and modify the data based on the strength of their connections, producing an output.
    Neural networks are used in a variety of applications, including image and speech recognition, natural language processing, etc.
    Elements of a Neural Network
    Input Layer: This layer accepts the input features. It brings information from the outside world into the network. No computation is performed at this layer; its nodes just pass the information (features) on to the hidden layer.
    Hidden Layer: Nodes of this layer are not exposed to the outer world; they are part of the abstraction provided by any neural network. The hidden layer performs the computation on the features entered through the input layer and transfers the result to the output layer.
    Output Layer: This layer presents the information learned by the network to the outer world.
  • 22.
    Activation Function
    Activation functions are mathematical functions that are applied to the output of each node in a neural network to introduce non-linearity into the model. These functions allow neural networks to learn complex patterns and relationships in data.
    2. Perceptron algorithm and how it works
    Ans: The Perceptron algorithm is a linear classification algorithm used for binary classification tasks. It is a supervised learning algorithm that learns a decision boundary separating the input data into two classes. The decision boundary is a hyperplane that divides the input space into two regions.
    Working
    • Initialize the weights and bias: The Perceptron algorithm starts by initializing the weights and bias to small random values.
  • 23.
    • Iterate over the training data: For each input data point, the Perceptron algorithm computes the weighted sum of the inputs (each input multiplied by its weight) and adds the bias term. This is the net input value.
    • Apply the activation function: The net input value is passed through an activation function (usually a step function) to produce the output value. If the net input is greater than or equal to 0, the Perceptron predicts the positive class; otherwise, it predicts the negative class.
    • Update the weights and bias: If the Perceptron makes an incorrect prediction, the weights and bias are updated to move the decision boundary towards the correct classification. Specifically, each weight is changed by the product of the learning rate (a small positive value that controls the magnitude of the update), the error (the true class minus the predicted class) and the corresponding input value. The bias is updated in the same way, but without the input value.
    • Repeat steps 2-4: The Perceptron algorithm iterates over the training data multiple times (epochs) until the decision boundary converges and no further weight updates are required.
    Weight: The weight parameter represents the strength of the connection between units. It is one of the most important parameters of a Perceptron.
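The working steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names and the logical-AND dataset are hypothetical, and two simplifications are made for reproducibility, namely that weights start at zero rather than at small random values and that the learning rate is 1.0.

```python
def predict(weights, bias, x):
    """Step activation: class 1 if the net input is >= 0, otherwise class 0."""
    net_input = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if net_input >= 0 else 0


def train_perceptron(samples, labels, learning_rate=1.0, max_epochs=100):
    # Assumption: zero initial weights instead of small random values.
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)   # -1, 0 or +1
            if error != 0:
                # Move the decision boundary towards the misclassified point.
                weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
                bias += learning_rate * error
                mistakes += 1
        if mistakes == 0:   # converged: a full pass with no updates
            break
    return weights, bias


# Logical AND is linearly separable, so the perceptron can learn it.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print("weights:", weights, "bias:", bias, "predictions:", predictions)
```

On data that is not linearly separable (XOR, for example) the loop would never reach a mistake-free pass and would simply stop after `max_epochs`.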
  • 24.
    3. Backpropagation algorithm
    Ans: Backpropagation is one of the key concepts of neural networks. The backpropagation algorithm is used for training artificial neural networks. It is a supervised learning algorithm that can be used for both classification and regression tasks.
    Working
    • First, we initialize the weights and biases of the neural network to small random values.
    • Then, we feed an input data point into the network and compute the output using the current weights and biases.
    • Next, we calculate the error between the predicted output and the true output.
    • The error is then propagated backwards through the network to obtain the gradient of the error with respect to each weight and bias.
    • The weights and biases are updated by gradient descent, i.e. each one is moved a small step in the direction opposite to its gradient.
    • This process is repeated for multiple input data points, in batches or one at a time, until the error is minimized and the network can accurately predict the output for new input data.
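The working steps above can be sketched for a very small network. This is an illustrative sketch under several assumptions that do not come from the notes: one hidden layer of three sigmoid units, a sigmoid output, squared-error loss, updates after every data point, and the XOR problem as training data. All names and hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_xor(epochs=5000, learning_rate=0.5, hidden=3, seed=0):
    rng = random.Random(seed)
    data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
    # Step 1: small random initial weights and biases.
    w1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = rng.uniform(-1, 1)
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, target in data:
            # Step 2: forward pass.
            h = [sigmoid(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j]) for j in range(hidden)]
            out = sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + b2)
            # Step 3: error between predicted and true output.
            total += 0.5 * (out - target) ** 2
            # Step 4: backward pass (chain rule from the output to each layer).
            delta_out = (out - target) * out * (1.0 - out)
            delta_h = [delta_out * w2[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            # Step 5: gradient descent, moving against the gradient.
            for j in range(hidden):
                w2[j] -= learning_rate * delta_out * h[j]
                w1[j][0] -= learning_rate * delta_h[j] * x[0]
                w1[j][1] -= learning_rate * delta_h[j] * x[1]
                b1[j] -= learning_rate * delta_h[j]
            b2 -= learning_rate * delta_out
        losses.append(total / len(data))
    return losses


losses = train_xor()
print(f"average loss in first epoch: {losses[0]:.4f}, in last epoch: {losses[-1]:.4f}")
```

How low the final loss gets depends on the random initialization; the point of the sketch is only that repeated forward and backward passes drive the error down.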
  • 25.
    4. Hyperparameters and their selection
    Ans: Hyperparameters are parameters that are not learned by the model during training but are set by the user before training begins. They control the behaviour of the learning algorithm and affect the performance of the model.
  • 26.
    Week 6
    1. Decision trees
    Ans: A decision tree is a supervised learning technique that can be used for both classification and regression problems, although it is mostly preferred for classification. It is a tree-structured classifier in which internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents an outcome.
    A decision tree has two types of node: decision nodes and leaf nodes. Decision nodes are used to make a decision and have multiple branches, whereas leaf nodes are the outputs of those decisions and do not contain any further branches.
    It is a graphical representation of all the possible solutions to a problem or decision under given conditions.
    Root Node: The root node is where the decision tree starts. It represents the entire dataset, which is then divided into two or more homogeneous sets.
    Leaf Node: Leaf nodes are the final output nodes; the tree cannot be split further once a leaf node is reached.
  • 27.
    Splitting: Splitting is the process of dividing a decision node or the root node into sub-nodes according to the given conditions.
    2. Pruning Techniques
    Ans: Pruning is the process of deleting unnecessary nodes from a tree in order to obtain the optimal decision tree. A tree that is too large increases the risk of overfitting, while a tree that is too small may not capture all the important features of the dataset. Pruning is therefore a technique that decreases the size of the learned tree without reducing its accuracy.
    Types of pruning techniques
    • Cost Complexity Pruning
    • Reduced Error Pruning
  • 28.
    3. Regression trees
    Ans: A regression tree is a decision tree used for regression, i.e. it predicts continuous-valued outputs instead of discrete outputs.
    Advantages of regression trees
    • The model is easy to visualize, as users can identify and follow each step.
    • A specific decision node can be given priority over other decision nodes.
    • As the regression tree progresses, undesired data is filtered out at each step, so only the important data is left to process, which increases efficiency and accuracy.
    • Regression trees are easy to prepare and can be used to present data during meetings, presentations, etc.
    4. Importance of stopping criteria
    Ans: Stopping criteria are used in decision trees to determine when to stop the recursive process of splitting the data into smaller subsets. They help prevent overfitting, where the model becomes too complex and starts to fit the noise in the data instead of the underlying pattern.
    Commonly used stopping criteria in decision trees
    • Maximum depth: Sets a limit on the depth of the tree. Once a branch reaches this depth, no more splits are allowed and the node becomes a leaf node.
    • Minimum number of samples: Specifies the minimum number of samples required to split a node. A node with fewer samples than this threshold is not split and becomes a leaf node.
    • Maximum number of leaf nodes: Sets a limit on the number of leaf nodes the tree can have. Once the tree has reached this limit, no more splits are allowed.
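A regression tree with two of the stopping criteria above (maximum depth and minimum number of samples) can be sketched for a single input feature. This is a simplified sketch under stated assumptions: splits are chosen to minimize the sum of squared errors, leaves predict the mean of their targets, and the function names, dictionary layout and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared errors around the mean (the impurity of a node)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(xs, ys, depth=0, max_depth=2, min_samples_split=2):
    # Stopping criteria: depth limit, too few samples, or a node that is already pure.
    if depth >= max_depth or len(ys) < min_samples_split or sse(ys) == 0:
        return {"leaf": True, "value": mean(ys)}
    best = None
    for threshold in sorted(set(xs))[1:]:   # skip the minimum so the left side is never empty
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold)
    if best is None:   # all x values identical: no split is possible
        return {"leaf": True, "value": mean(ys)}
    threshold = best[1]
    left_idx = [i for i, x in enumerate(xs) if x < threshold]
    right_idx = [i for i, x in enumerate(xs) if x >= threshold]
    return {
        "leaf": False,
        "threshold": threshold,
        "left": build_tree([xs[i] for i in left_idx], [ys[i] for i in left_idx],
                           depth + 1, max_depth, min_samples_split),
        "right": build_tree([xs[i] for i in right_idx], [ys[i] for i in right_idx],
                            depth + 1, max_depth, min_samples_split),
    }


def predict(node, x):
    while not node["leaf"]:
        node = node["left"] if x < node["threshold"] else node["right"]
    return node["value"]


# Hypothetical step-shaped data: low targets for x < 4, high targets otherwise.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
stump = build_tree(xs, ys, max_depth=1)      # one split allowed
root_only = build_tree(xs, ys, max_depth=0)  # no splits: a single leaf
print(predict(stump, 2), predict(stump, 5), predict(root_only, 2))
```

With `max_depth=0` the tree predicts the overall mean everywhere (underfitting); raising the limit lets it follow the data more closely, which is where the overfitting risk comes from.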
  • 29.
    Week 7
    1. ROC curve.
    Ans: The ROC (Receiver Operating Characteristic) curve is a graphical plot that illustrates the performance of a binary classifier at different classification thresholds. The curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.
    The true positive rate (TPR), also called sensitivity, is the fraction of positive instances that are correctly identified by the classifier. The false positive rate (FPR) is the fraction of negative instances that are incorrectly classified as positive.
    2. Cross-Validation and K-fold method
    Ans: Cross-validation is a technique for validating a model by training it on a subset of the input data and testing it on a previously unseen subset. In other words, it checks how well a statistical model generalizes to an independent dataset.
    The basic idea of cross-validation is to split the data into two or more subsets, where some subsets are used for training the model and the remaining subset is used for testing it. The training subset is used to fit the model, and the testing subset is used to evaluate the performance of the model on new, unseen data. This process is repeated multiple times with different subsets, and the performance metrics are averaged to obtain an estimate of the generalization performance.
    K-Fold Cross-Validation
    The k-fold cross-validation approach divides the input dataset into k groups of samples of (approximately) equal size. These groups are called folds. The model is trained on k-1 folds and tested on the remaining fold. This process is repeated k times, with each fold serving as the testing set once. The
  • 30.
    performance metrics are averaged over the k iterations to obtain an estimate of the generalization performance.
    The steps for k-fold cross-validation
    • Split the input dataset into k groups.
    • For each group:
        o Take that group as the reserved (test) dataset.
        o Use the remaining groups as the training dataset.
        o Fit the model on the training set and evaluate its performance on the test set.
    3. Ensemble Methods.
    Ans: Ensemble methods in machine learning are techniques that combine multiple individual models to create a more accurate and robust model. They are based on the idea that combining multiple models often performs better than using a single model.
    4. Bagging and Boosting
    Ans: Bagging, also known as bootstrap aggregating, is an ensemble learning technique that improves the performance and accuracy of machine learning algorithms. It addresses the bias-variance trade-off by reducing the variance of a prediction model. Bagging helps avoid overfitting and is used for both regression and classification models, most commonly with decision tree algorithms.
  • 31.
    Boosting is a machine learning technique for building ensemble models in which multiple weak learners are combined to form a strong learner. A weak learner is a model that performs only slightly better than random guessing. Boosting improves accuracy by iteratively training new models on the training data and adjusting the weights of the training examples, so that later models concentrate on the examples earlier models got wrong.
    Types of Boosting
    • Adaptive boosting (AdaBoost)
    • Gradient boosting
    • Extreme gradient boosting (XGBoost)
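The k-fold cross-validation steps described earlier in this week's notes can be sketched as follows. This is a minimal sketch under stated assumptions: the data is split in order without shuffling, and the "model" is a hypothetical placeholder that always predicts the mean of the training targets. The function names `k_fold_indices`, `cross_validate`, `fit_mean` and `mse` are introduced here for illustration.

```python
def k_fold_indices(n_samples, k):
    """Split the indices 0..n_samples-1 into k folds of (nearly) equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, k, fit, score):
    """Train on k-1 folds, test on the held-out fold, and average the k scores."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx]))
    return sum(scores) / k


# Hypothetical placeholder model: always predicts the mean of the training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def mse(model, test_x, test_y):
    return sum((y - model) ** 2 for y in test_y) / len(test_y)


xs = list(range(6))
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
folds = k_fold_indices(len(xs), 3)
avg_mse = cross_validate(xs, ys, 3, fit_mean, mse)
print("folds:", folds, "average test MSE:", avg_mse)
```

In practice the data is usually shuffled before splitting, so that each fold is representative of the whole dataset.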
  • 32.
    Week 8
    1. Random Forest
    Ans: Random Forest is a popular supervised machine learning algorithm that can be used for both classification and regression problems. It is an ensemble learning method that combines multiple decision trees to form a stronger model.
    Random Forest has several advantages over other machine learning algorithms. It can handle a large number of input features and works well even when some of the features are irrelevant or redundant. It is also less prone to overfitting than a single decision tree, since the predictions of the individual trees are combined to form a more robust overall prediction.
  • 33.
    2. Naive Bayes and Bayesian Networks
    Ans:
    Bayesian Networks
    Bayesian networks, also known as Bayes networks or belief networks, are a type of probabilistic graphical model used in machine learning, statistics, and artificial intelligence. They are used to represent and reason about uncertainty and probability in complex systems.
    Bayesian networks can be used for a variety of tasks in machine learning, including classification, regression, anomaly detection, and decision making under uncertainty. Their advantages include the ability to handle missing data and the ability to incorporate domain knowledge into the model.
    Naïve Bayes
    • The Naïve Bayes algorithm is a supervised learning algorithm based on Bayes' theorem and used for solving classification problems.
    • It is mainly used in text classification, where the training dataset is high-dimensional.
    • The Naïve Bayes classifier is one of the simplest and most effective classification algorithms; it helps build fast machine learning models that can make quick predictions.
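Since Naïve Bayes is mainly used for text classification, a tiny spam filter makes a natural illustration. This is a minimal sketch, assuming the multinomial variant with add-one (Laplace) smoothing, which the notes do not specify; the function names and the four training messages are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """documents: list of (text, label) pairs. Returns the counts needed for prediction."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Bayes' theorem in log form: log P(class) + sum of log P(word | class).
        # The "naive" assumption is that words are independent given the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue   # ignore words never seen during training
            # Add-one smoothing avoids zero probabilities for unseen word/class pairs.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training messages.
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting schedule for monday", "ham"),
    ("project meeting notes", "ham"),
]
model = train_naive_bayes(training)
print(predict(model, "free money"), predict(model, "meeting notes"))
```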
  • 34.
    Week 12
    1. Introduction to reinforcement learning
    Ans: Reinforcement learning is a feedback-based machine learning technique in which an AI agent (a software component) automatically explores its surroundings by trial and error: it takes actions, learns from experience, and improves its performance. The agent is rewarded for each good action and penalized for each bad action; hence the goal of a reinforcement learning agent is to maximize its rewards.
    In reinforcement learning, the agent learns automatically from feedback without any labelled data, unlike supervised learning.
    Agent: An entity that can perceive/explore the environment and act upon it.
    Environment: The situation in which the agent is present or by which it is surrounded.
    Action: Actions are the moves taken by an agent within the environment.
    State: The state is the situation returned by the environment after each action taken by the agent.
  • 35.
    Reward: Feedback returned to the agent from the environment to evaluate the agent's action.
    Advantages and Disadvantages
    Advantages
    • Flexibility: Reinforcement learning can be used in a variety of problem domains, including robotics, games, and finance. It can also handle problems with continuous state and action spaces.
    • Adaptability: RL agents can learn from experience and adjust their behaviour to changing environments.
    • Autonomy: Once an RL agent has been trained, it can operate autonomously, without the need for constant supervision.
    • Optimal decision-making: Reinforcement learning can find the optimal decision-making policy for a given task, which can lead to better outcomes than human-designed policies.
    Disadvantages
    • Requires a large amount of data to learn an effective policy.
    • RL algorithms are not preferred for simple problems.
    • RL algorithms require a lot of computation.
    • Too much reinforcement can lead to an overload of states, which can weaken the results.
    2. Features and applications of reinforcement learning
    Ans:
    Features
    • In RL, the agent is not told about the environment or which actions to take.
    • It is based on a trial-and-error process.
    • The agent takes the next action and changes state according to the feedback from the previous action.
    • The agent may get a delayed reward.
  • 36.
    • The environment can be very complex, and the agent needs to explore it to obtain the maximum positive reward.
    Applications
    Robotics: Reinforcement learning has been applied to train robots to perform tasks such as object manipulation, navigation, and grasping. For example, RL has been used to teach robots to play table tennis, fold towels, and open doors.
    Game Playing: Reinforcement learning has been successfully applied to train agents that play games such as Chess and Go, as well as video games.
    Natural Language Processing: Reinforcement learning is being used to improve natural language processing tasks such as machine translation, question answering, and chatbots. RL helps the agent learn to generate better responses based on the feedback it receives.
    Traffic light control: RL can be used to optimize traffic flow and reduce congestion. RL algorithms can learn from historical data and optimize the timing of traffic signals to minimize waiting times. By learning from experience, traffic light control systems can adjust signal timings dynamically based on traffic volumes and congestion levels.
    Driverless cars: Driverless cars use reinforcement learning to learn how to navigate through traffic and make decisions based on the environment they encounter. RL algorithms help the car learn from past experience, such as recognizing different objects on the road and adapting to changing road conditions. Reinforcement learning is a critical component in the development of autonomous vehicles.
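The terms agent, environment, state, action and reward can be made concrete with a toy example. The sketch below uses tabular Q-learning, which is one common RL method and is not described in these notes. The five-state corridor, the reward of 1 at the goal, the purely random exploration and all hyperparameters are assumptions chosen for illustration.

```python
import random

N_STATES = 5          # states 0..4 in a corridor; state 4 is the goal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Environment: returns (next_state, reward, done) for the agent's action."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    # q[state][a] estimates the long-term reward of taking action a in that state.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(2)   # trial and error: explore with random actions
            next_state, reward, done = step(state, ACTIONS[a])
            # The reward is delayed: it only arrives at the goal and is passed
            # back to earlier states through the discounted next-state value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q = q_learning()
# Greedy policy: in each non-goal state, pick the action with the highest value.
greedy_policy = [ACTIONS[row.index(max(row))] for row in q[:-1]]
print("greedy policy (+1 = right):", greedy_policy)
```

No labelled examples are involved: the agent is never told that moving right is correct, it only observes which actions eventually lead to reward.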
  • 37.
    3. Comparison of reinforcement learning and deep learning
    Ans:
    Aspect | Reinforcement Learning (RL) | Deep Learning (DL)
    Objective | Learn to take actions to maximize a reward signal | Learn to generalize patterns in data
    Feedback | Received in the form of rewards or penalties based on actions taken in an environment | Received as error between predicted and actual outputs
    Training | Requires trial-and-error approach and feedback mechanism to guide learning | Trained using input data and backpropagation to minimize error
    Use Cases | Robotics, game playing, autonomous systems | Image recognition, speech recognition, natural language processing, data analysis
    Data Requirements | Requires fewer input data but more feedback data | Requires large amounts of input data
    Computational Resources | Requires less computational resources compared to DL | Requires more computational resources compared to RL
  • 38.
    4. Comparison of supervised and reinforcement learning
    Ans:
    Reinforcement Learning | Supervised Learning
    RL works by interacting with the environment. | Supervised learning works on an existing dataset.
    RL works the way the human brain does when making decisions. | Supervised learning works the way a human learns under the supervision of a guide.
    No labelled dataset is present. | A labelled dataset is present.
    No previous training is provided to the learning agent. | Training is provided to the algorithm so that it can predict the output.
    RL helps to take decisions sequentially. | In supervised learning, a decision is made when an input is given.