Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
Types of ML :-
 There are four types of machine learning:
1.Supervised Learning:
 In supervised learning, the learning is guided by a teacher. The labelled
dataset acts as the teacher, and its role is to train the model. Once the
model is trained, it can make predictions or decisions when new data is
given to it.
 Supervised learning uses labelled training data to learn the mapping
function that turns input variables (X) into the output variable (Y). In
other words, it solves for f in the following equation:
Y = f (X)
 Once f has been learned, the model can generate outputs for new inputs. How
accurate they are depends on the quality of the training data and of the model.
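As a minimal sketch of what solving for f can look like, the following fits a straight line to labelled (X, Y) pairs by ordinary least squares. The function name fit_line and the toy data are illustrative assumptions, not part of the slides, and a real f is rarely this simple.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to labelled pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Labelled training data (hypothetical): Y was generated as 3 * X + 1.
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [1.0, 4.0, 7.0, 10.0, 13.0]

slope, intercept = fit_line(X, Y)


def f(x):
    """The learned mapping from an input to an output."""
    return slope * x + intercept


# Prediction for an input that was not in the training data.
prediction = f(5.0)
print(slope, intercept, prediction)
```

Here the labels play the role of the teacher: the fit only recovers the rule because every X came with its Y.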
 Two types of supervised learning are: classification and regression.
 Classification is used to predict the outcome of a given sample when the
output variable is in the form of categories. A classification model might look
at the input data and try to predict labels like “sick” or “healthy.”
 Regression is used to predict the outcome of a given sample when the
output variable is in the form of real values. For example, a regression
model might process input data to predict the amount of rainfall, the height
of a person, etc.
 Ensembling is a technique often used in supervised learning. It means
combining the predictions of multiple machine learning models that are
individually weak to produce a more accurate prediction on a new sample.
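The ensembling idea can be sketched with a simple majority vote. The three rule-based classifiers below, their thresholds, and the patient fields are hypothetical, chosen only to echo the “sick” / “healthy” labels above; practical ensembles combine trained models.

```python
from collections import Counter


# Three hypothetical weak classifiers, each looking at a single feature.
def by_temperature(patient):
    return "sick" if patient["temp_c"] > 37.5 else "healthy"


def by_heart_rate(patient):
    return "sick" if patient["heart_rate"] > 100 else "healthy"


def by_cough(patient):
    return "sick" if patient["cough"] else "healthy"


def majority_vote(models, sample):
    """Return the label predicted by most of the given models."""
    votes = [model(sample) for model in models]
    return Counter(votes).most_common(1)[0][0]


models = [by_temperature, by_heart_rate, by_cough]
patient_a = {"temp_c": 38.2, "heart_rate": 70, "cough": True}
patient_b = {"temp_c": 36.8, "heart_rate": 110, "cough": False}

prediction_a = majority_vote(models, patient_a)  # two of three vote "sick"
prediction_b = majority_vote(models, patient_b)  # two of three vote "healthy"
print(prediction_a, prediction_b)
```

An odd number of voters is used so that a two-class vote cannot end in a tie.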
 Thus, in supervised machine learning:
 The output for each training input is known beforehand, and the machine
must learn to map a given input to the correct output.
For example, multiple labelled images of a cat, a dog, an orange, an apple
and so on are fed into the machine for training, and the machine must then
identify the same objects. A child who is shown a cat and told that it is a
cat can later identify a completely different cat among other animals; the
same method is employed here. In short, Supervised Learning means: Train Me!
2.Unsupervised Learning:
 Unsupervised learning models are used when we only have the input
variables (X) and no corresponding output variables.
 They use unlabelled training data to model the underlying structure of the
data. The inputs, for example images, are given to the model mixed together
and without labels, and the model extracts insights from them.
 The model learns through observation and finds structures in the data. Once
the model is given a dataset, it automatically finds patterns and relationships
in the dataset by creating clusters in it.
 What it cannot do is add labels to the clusters: it cannot say that this is
a group of apples or mangoes, but it will separate the apples from the
mangoes.
 Two types of unsupervised learning are: association and clustering.
 Association is used to discover the probability of the co-occurrence of items
in a collection. It is extensively used in market-basket analysis. For example,
an association model might be used to discover that if a customer purchases
bread, s/he is 80% likely to also purchase eggs.
 Clustering is used to group samples such that objects within the same
cluster are more similar to each other than to the objects from another
cluster.
 Apriori (association), K-means (clustering) and PCA (dimensionality
reduction) are examples of unsupervised learning algorithms.
 Suppose we present images of apples, bananas and mangoes to the model.
Based on patterns and relationships in the images, it creates clusters and
divides the dataset into those clusters. When new data is fed to the model,
it is added to one of the existing clusters.
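Both ideas above can be sketched in a few lines. The first function estimates the "bread implies eggs" likelihood as a confidence value over a list of baskets; the second is a plain k-means loop on one-dimensional numbers. The basket contents, the points, and the starting centers are made-up illustrations, and the function names are assumptions rather than any library's API.

```python
def confidence(transactions, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_antecedent if consequent in t]
    return len(with_both) / len(with_antecedent)


def nearest_center(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda k: abs(point - centers[k]))


def kmeans_1d(points, centers, max_iter=100):
    """Plain k-means on numbers; returns the final centers and assignments."""
    centers = list(centers)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest_center(p, centers) for p in points]
        if new_assignment == assignment:
            break  # assignments stopped changing, so the clusters are stable
        assignment = new_assignment
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = sum(members) / len(members)
    return centers, assignment


# Association: 5 baskets contain bread, and 4 of those also contain eggs.
baskets = [
    {"bread", "eggs"}, {"bread", "eggs", "milk"}, {"bread", "eggs"},
    {"bread", "milk"}, {"bread", "eggs", "jam"}, {"milk"}, {"eggs", "milk"},
]
bread_to_eggs = confidence(baskets, "bread", "eggs")

# Clustering: two obvious groups, found without any labels.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, labels = kmeans_1d(points, centers=[1.0, 2.0])

# A new data point is added to the nearest existing cluster.
new_point_cluster = nearest_center(9.0, centers)
print(bread_to_eggs, centers, labels, new_point_cluster)
```

The clusters come back as indices 0 and 1 only; as noted above, the algorithm cannot name them.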
Fig: grouping of similar data
3.Semi-supervised Learning:
 It lies between supervised and unsupervised learning: a combination of
the two is used to produce the desired results. It is important in
real-world scenarios, where the available data is often a mixture of
labelled and unlabelled data.
4.Reinforcement Learning:
 The machine is exposed to an environment where it is trained by trial and
error to make specific decisions. The machine learns from past experience
and tries to capture the best possible knowledge to make accurate decisions
based on the feedback received. The algorithm allows an agent to decide the
best next action based on its current state by learning behaviours that will
maximize a reward.
 It is the ability of an agent to interact with the environment and find out
which outcome is best. It follows the trial-and-error approach. The agent is
rewarded with a point for a correct answer and penalized for a wrong one,
and the model trains itself on the basis of the reward points gained.
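The reward-driven loop can be sketched as a tiny epsilon-greedy agent. This is a deliberately simplified, stateless setting (a multi-armed bandit) with fixed rewards per action; the reward values, the function name run_bandit, and the parameters epsilon and seed are all illustrative assumptions.

```python
import random


def run_bandit(rewards, steps=200, epsilon=0.1, seed=0):
    """Learn by trial and error which action yields the highest reward."""
    rng = random.Random(seed)
    n_actions = len(rewards)
    estimates = [0.0] * n_actions  # the agent's current reward estimates
    counts = [0] * n_actions
    for step in range(steps):
        if step < n_actions:
            action = step  # try every action once before choosing
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore a random action
        else:
            # exploit the action that currently looks best
            action = max(range(n_actions), key=lambda a: estimates[a])
        reward = rewards[action]  # feedback from the environment
        counts[action] += 1
        # incremental average of the rewards seen for this action
        estimates[action] += (reward - estimates[action]) / counts[action]
    best = max(range(n_actions), key=lambda a: estimates[a])
    return best, estimates


# Hypothetical environment: action 1 is rewarded, action 0 is penalized.
best_action, estimates = run_bandit(rewards=[-1.0, 1.0, 0.0])
print(best_action, estimates)
```

A full reinforcement learning problem would also track the agent's state, which this sketch omits.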
Fig : Types of Machine Learning
1.Overfitting : Overfitting refers to a model that models the training data
too well.
 Overfitting happens when a model learns the detail and noise in the training
data to the extent that it negatively impacts the performance of the model on
new data. This means that the noise or random fluctuations in the training
data are picked up and learned as concepts by the model. The problem is that
these concepts do not apply to new data and negatively impact the model’s
ability to generalize.
 Overfitting is more likely with nonparametric and nonlinear models that have
more flexibility when learning a target function. As such, many
nonparametric machine learning algorithms also include parameters or
techniques to limit and constrain how much detail the model learns.
2.Underfitting : Underfitting refers to a model that can neither model the
training data nor generalize to new data.
 An underfit machine learning model is not a suitable model, and this will be
obvious because it performs poorly even on the training data.
 Underfitting is often not discussed as it is easy to detect given a good
performance metric. The remedy is to move on and try alternative machine
learning algorithms. Nevertheless, it provides a good contrast to the
problem of overfitting.
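The contrast can be made concrete with three toy models on the same data: one that always predicts the mean (underfit), a least-squares line (reasonable fit), and one that memorizes the nearest training point (overfit). The data, the alternating +1 / -1 noise, and the choice to score the test set against the noise-free signal are assumptions made for illustration.

```python
def mse(model, xs, ys):
    """Mean squared error of `model` on the given inputs and targets."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Training data: signal y = 2x plus alternating noise of +1 and -1.
train_x = [float(i) for i in range(10)]
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]

# Test data: unseen inputs, scored against the noise-free signal.
test_x = [i + 0.25 for i in range(9)]
test_y = [2 * x for x in test_x]

mean_y = sum(train_y) / len(train_y)
mean_x = sum(train_x) / len(train_x)
sxx = sum((x - mean_x) ** 2 for x in train_x)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def underfit(x):
    return mean_y  # ignores the input entirely


def good_fit(x):
    return slope * x + intercept  # captures the trend, not the noise


def overfit(x):
    # memorizes the data: returns the label of the nearest training input
    i = min(range(len(train_x)), key=lambda k: abs(train_x[k] - x))
    return train_y[i]


for name, model in [("underfit", underfit), ("good fit", good_fit), ("overfit", overfit)]:
    print(name, mse(model, train_x, train_y), mse(model, test_x, test_y))
```

The memorizing model has zero training error yet a larger test error than the line, while the mean predictor is poor on both sets.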
 Bias: It measures how far, on average, the model’s predictions are from the
true values. Algorithms with high bias are generally fast to learn and easy
to understand, but they are less flexible. They lose the ability to capture
complex relationships in the data, which results in underfitting of our
model.
 For a high-bias model, getting more training data will not help much.
 Variance: It is the amount by which the predictions change when a data
point changes or when a different training dataset is used. Ideally, the
predicted values should remain nearly the same when moving from one training
dataset to another, but if the model has high variance, its predictions are
strongly affected by the particular values in the dataset.
 “Signal” is the true underlying pattern that you wish to learn from the data.
 “Noise”, on the other hand, refers to the irrelevant information or randomness
in a dataset.
 Overfitting and underfitting are the two main problems that occur in machine
learning and degrade the performance of machine learning models.
 The main goal of each machine learning model is to generalize well.
Here generalization means the ability of an ML model to produce suitable
output for inputs it has not seen before. It means that, after being trained
on the dataset, the model can produce reliable and accurate output. Hence,
underfitting and overfitting are the two conditions that need to be checked
to judge the performance of the model and whether it is generalizing well or
not.
 Over fitting :
 Overfitting occurs when our machine learning model tries to cover all the
data points, or more than the required data points, present in the given
dataset. Because of this, the model starts capturing noise and inaccurate
values present in the dataset, and all these factors reduce the efficiency and
accuracy of the model. The overfitted model has low bias and high
variance.
 The chance of overfitting increases the longer we train our model on the
same data: the more we train, the more likely the model is to become
overfitted.
 Overfitting is the main problem that occurs in supervised learning.
 Example: The concept of overfitting can be understood from the below
graph of the linear regression output:
 In the above graph, the model tries to cover all the data points present in
the scatter plot. It may look efficient, but in reality it is not. The goal
of the regression model is to find the best-fit line, but here we have not
got a best fit, so the model will generate prediction errors.
 How to avoid overfitting in a model :
 Both overfitting and underfitting degrade the performance of the machine
learning model. Overfitting is the more common problem, so there are some
ways by which we can reduce the occurrence of overfitting in our model:
 Cross-Validation
 Training with more data
 Removing features
 Early stopping the training
 Regularization
 Ensembling
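Of the techniques listed, cross-validation is the easiest to sketch without a model: the data is split into k folds, and each fold takes one turn as the validation set while the rest is used for training. The helper name k_fold_indices is an assumption; libraries offer equivalent utilities with shuffling and stratification.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

Averaging a model's score over the k validation folds gives a less optimistic estimate than its score on the training data alone.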
 Underfitting :
 Underfitting occurs when our machine learning model is not able to capture
the underlying trend of the data. It can happen, for example, when training
is stopped at too early a stage in an attempt to avoid overfitting, so that
the model does not learn enough from the training data. As a result, it may
fail to find the best fit of the dominant trend in the data.
 In the case of underfitting, the model is not able to learn enough from the
training data, and hence its accuracy is reduced and its predictions are
unreliable.
 An underfitted model has high bias and low variance.
 Example: We can understand underfitting using the below output of the
linear regression model:
 In the above graph, the model is unable to capture the data points present
in the plot.
 How to avoid underfitting:
 By increasing the training time of the model.
 By increasing the number of features.
 Goodness of Fit :
 The term "goodness of fit" is taken from statistics, and the goal of
machine learning models is to achieve a good fit. In statistical modeling,
it describes how closely the predicted values match the true values of the
dataset.
 The model with a good fit lies between the underfitted and the overfitted
model. Ideally, it makes predictions with zero error, but in practice this
is difficult to achieve.
 Two methods help us find a good point for our model: resampling methods,
which estimate model accuracy, and a validation dataset.
 The machine learning life cycle is a cyclic process for building an
efficient machine learning project. The main purpose of the life cycle is to
find a solution to the problem or project.
 The machine learning life cycle involves seven major steps, which are given
below:
 Gathering Data
 Data preparation
 Data Wrangling
 Analyze Data
 Train the model
 Test the model
 Deployment
 In the complete life cycle process, to solve a problem, we create a machine
learning system called a "model", and this model is created by providing
"training". But to train a model we need data; hence, the life cycle starts
with collecting data.
 The most important thing in the complete process is to understand the
problem and to know its purpose.
1. Gathering Data:
 Data gathering is the first step of the machine learning life cycle. The
goal of this step is to identify and obtain all the data related to the
problem.
 In this step, we need to identify the different data sources, as data can be
collected from various sources such as files, databases, the internet, or
mobile devices. It is one of the most important steps of the life cycle. The
quantity and quality of the collected data will determine the efficiency of
the output. In general, the more good-quality data we have, the more accurate
the prediction tends to be.
 This step includes the below tasks:
 Identify various data sources
 Collect data
 Integrate the data obtained from different sources
 By performing the above tasks, we get a coherent set of data, also called a dataset.
It will be used in further steps.
2. Data preparation :
 After collecting the data, we need to prepare it for further steps. Data preparation is a
step where we put our data into a suitable place and prepare it for use in our machine
learning training.
 In this step, we first put all the data together and then randomize its ordering.
 This step can be further divided into two processes:
 Data exploration:
It is used to understand the nature of the data that we have to work with. We need to
understand the characteristics, format, and quality of the data.
A better understanding of the data leads to a more effective outcome. Here we look for
correlations, general trends, and outliers.
 Data pre-processing:
The next step is pre-processing the data for analysis.
3. Data Wrangling :
 Data wrangling is the process of cleaning and converting raw data into a usable
format. It involves cleaning the data, selecting the variables to use, and
transforming the data into a proper format to make it more suitable for analysis in the
next step. It is one of the most important steps of the complete process. Cleaning of
data is required to address quality issues.
 Not all the data we have collected is necessarily useful. In real-world
applications, collected data may have various issues, including:
 Missing Values
 Duplicate data
 Invalid data
 Noise
 So, we use various filtering techniques to clean the data.
 It is important to detect and remove the above issues because they can negatively
affect the quality of the outcome.
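A simple filtering pass for three of the issues above (missing values, duplicates, invalid data) might look like the following. The record layout, the field names, and the 0 to 120 age range are hypothetical; dropping rows is only one option, and filling in missing values is a common alternative.

```python
def clean(records):
    """Drop rows with missing values, out-of-range ages, and exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec["name"] is None or rec["age"] is None:
            continue  # missing value
        if not 0 <= rec["age"] <= 120:
            continue  # invalid value (assumed valid range)
        key = (rec["name"], rec["age"])
        if key in seen:
            continue  # duplicate of a row already kept
        seen.add(key)
        cleaned.append(rec)
    return cleaned


raw = [
    {"name": "Asha", "age": 34},
    {"name": "Ravi", "age": None},   # missing value
    {"name": "Asha", "age": 34},     # duplicate
    {"name": "Meera", "age": -5},    # invalid value
    {"name": "John", "age": 51},
]
cleaned = clean(raw)
print(cleaned)
```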
4. Data Analysis :
 Now the cleaned and prepared data is passed on to the analysis step. This
step involves:
 Selection of analytical techniques
 Building models
 Review the result
 The aim of this step is to build a machine learning model to analyze the data
using various analytical techniques and to review the outcome. It starts with
determining the type of problem, where we select a machine learning technique
such as classification, regression, cluster analysis, or association. We
then build the model using the prepared data and evaluate it.
 Hence, in this step, we take the data and use machine learning algorithms to
build the model.
5. Train Model :
 The next step is to train the model. In this step we train our model to
improve its performance and get a better outcome for the problem.
 We use datasets to train the model using various machine learning algorithms.
Training a model is required so that it can learn the various patterns,
rules, and features.
6. Test Model :
 Once our machine learning model has been trained on a given dataset, we
test the model. In this step, we check the accuracy of our model by
providing a test dataset to it.
 Testing the model determines its percentage accuracy with respect to the
requirements of the project or problem.
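The test dataset is usually held out before training so that the model is scored on data it has never seen. A minimal sketch of such a split follows; the function name, the 20% test fraction, and the fixed seed are illustrative assumptions.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and hold out a fraction of it for testing."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)  # randomize the ordering before splitting
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for ten prepared data items
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))
```

Fixing the seed makes the split repeatable, which helps when comparing models on the same held-out data.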
7. Deployment :
 The last step of the machine learning life cycle is deployment, where we
deploy the model in a real-world system.
 If the model prepared above is producing accurate results as per our
requirements with acceptable speed, then we deploy the model in the real
system. But before deploying the project, we check whether it is improving
its performance using the available data or not. The deployment phase is
similar to making the final report for a project.
 In machine learning classification problems, there are often too many factors
on the basis of which the final classification is done. These factors are
variables called features. The higher the number of features, the harder it
gets to visualize the training set and then work on it. Often, many of these
features are correlated, and hence redundant. This is where dimensionality
reduction algorithms come into play. Dimensionality reduction is the process
of reducing the number of random variables under consideration by obtaining
a set of principal variables. It can be divided into feature selection and
feature extraction.
 A 3-D classification problem can be hard to visualize, whereas a 2-D one
can be mapped to a simple two-dimensional space, and a 1-D problem to a
simple line. The below figure illustrates this concept, where a 3-D feature
space is split into two 1-D feature spaces, and later, if these are found to
be correlated, the number of features can be reduced even further.
 Components of Dimensionality Reduction :
Feature selection: Here we try to find a subset of the original set of
variables, or features, that can be used to model the problem. It usually
involves one of three approaches:
• Filter
• Wrapper
• Embedded
Feature extraction: This reduces the data in a high-dimensional space to a
lower-dimensional space, i.e. a space with fewer dimensions.
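As a small sketch of the filter approach to feature selection, the following drops features whose variance does not exceed a threshold, since a column that never changes cannot help separate samples. The column names, the data, and the threshold of 0.0 are hypothetical; wrapper and embedded methods would involve a model and are not shown.

```python
def variance(values):
    """Population variance of a sequence of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def select_by_variance(rows, names, threshold=0.0):
    """Keep only the columns whose variance exceeds `threshold`."""
    columns = list(zip(*rows))  # transpose rows into columns
    keep = [j for j, col in enumerate(columns) if variance(col) > threshold]
    kept_names = [names[j] for j in keep]
    reduced = [[row[j] for j in keep] for row in rows]
    return kept_names, reduced


names = ["height_cm", "constant_flag", "weight_kg"]
rows = [
    [170, 1, 65],
    [180, 1, 80],
    [160, 1, 55],
    [175, 1, 72],
]
kept_names, reduced = select_by_variance(rows, names)
print(kept_names, reduced)
```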
 Methods of Dimensionality Reduction :
 The various methods used for dimensionality reduction include:
 Principal Component Analysis (PCA)
 Linear Discriminant Analysis (LDA)
 Generalized Discriminant Analysis (GDA)
 Dimensionality reduction may be either linear or non-linear, depending upon
the method used. The prime linear method is Principal Component Analysis,
or PCA.
 Principal Component Analysis(PCA)
 This method was introduced by Karl Pearson. It works on the condition that
when data in a higher-dimensional space is mapped to a lower-dimensional
space, the variance of the data in the lower-dimensional space should be
maximal.
 It involves the following steps:
 Construct the covariance matrix of the data.
 Compute the eigenvectors of this matrix.
 Eigenvectors corresponding to the largest eigenvalues are used to
reconstruct a large fraction of the variance of the original data.
 Hence, we are left with a smaller number of eigenvectors, and some
information may have been lost in the process. But the most important
variances should be retained by the remaining eigenvectors.
 Advantages of Dimensionality Reduction :
 It helps in data compression, and hence reduces storage space.
 It reduces computation time.
 It also helps remove redundant features, if any.
 Disadvantages of Dimensionality Reduction :
 It may lead to some amount of data loss.
 PCA finds only linear correlations between variables, which is sometimes
undesirable.
 PCA fails in cases where mean and covariance are not enough to define
datasets.
 We may not know how many principal components to keep; in practice,
some rules of thumb are applied.
Principal Component Analysis :
 Principal Component Analysis is an unsupervised learning algorithm that is
used for dimensionality reduction in machine learning. It is a statistical
process that converts the observations of correlated features into a set of
linearly uncorrelated features with the help of an orthogonal transformation.
These new transformed features are called the Principal Components. It is
one of the popular tools used for exploratory data analysis and predictive
modeling. It is a technique for drawing out strong patterns from the given
dataset by reducing the number of dimensions.
 PCA generally tries to find a lower-dimensional surface onto which to
project the high-dimensional data.
 PCA works by considering the variance of each attribute, because an
attribute with high variance tends to show a good split between the classes,
and on that basis it reduces the dimensionality. Some real-world
applications of PCA are image processing, movie recommendation systems, and
optimizing the power allocation in various communication channels. It is a
feature extraction technique, so it retains the important variables and
drops the least important ones.
The PCA algorithm is based on some mathematical concepts such as:
 Variance and Covariance
 Eigenvalues and Eigenvectors
Some common terms used in PCA algorithm:
 Dimensionality: It is the number of features or variables present in the given
dataset. More easily, it is the number of columns present in the dataset.
 Correlation: It signifies how strongly two variables are related to each other,
such that if one changes, the other variable also changes. The correlation
value ranges from -1 to +1. Here, -1 indicates a perfect negative linear
relationship between the variables, and +1 indicates a perfect positive linear
relationship.
 Orthogonal: It means that variables are not correlated with each other, and hence
the correlation between the pair of variables is zero.
 Eigenvectors: Suppose there is a square matrix M and a non-zero vector v.
Then v is an eigenvector of M if Mv is a scalar multiple of v.
 Covariance Matrix: A matrix containing the covariance between the pair of
variables is called the Covariance Matrix.
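The range of the correlation value can be checked numerically. The sketch below computes the Pearson correlation coefficient for three made-up pairs of variables: one rising together, one moving in opposite directions, and one with no linear relationship. The function name and the data are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
rising = [2 * x + 1 for x in xs]       # increases with x
falling = [-x for x in xs]             # decreases as x increases
unrelated = [2.0, 1.0, 3.0, 1.0, 2.0]  # no linear trend in x

r_rising = pearson(xs, rising)        # close to +1
r_falling = pearson(xs, falling)      # close to -1
r_unrelated = pearson(xs, unrelated)  # close to 0
print(r_rising, r_falling, r_unrelated)
```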
Principal Components in PCA :
 As described above, the transformed new features, or the output of PCA, are
the Principal Components. The number of these PCs is either equal to or
less than the number of original features present in the dataset. Some
properties of these principal components are given below:
 Each principal component must be a linear combination of the original
features.
 These components are orthogonal, i.e., the correlation between a pair of
components is zero.
 The importance of each component decreases when going from 1 to n: the
1st PC has the most importance, and the nth PC has the least importance.
Steps for PCA algorithm :
1.Getting the dataset
Firstly, we need to take the input dataset and divide it into two subparts X
and Y, where X is the training set, and Y is the validation set.
2.Representing data into a structure
Now we will represent our dataset in a structure, namely a two-dimensional
matrix of the independent variable X. Here each row corresponds to a data
item, and each column corresponds to a feature. The number of columns is the
number of dimensions of the dataset.
3.Standardizing the data
In this step, we will standardize our dataset. Otherwise, within the
dataset, the features with high variance would be treated as more important
than the features with lower variance.
If the importance of features is independent of the variance of the feature,
then we will subtract the column mean from each data item and divide it by
the standard deviation of the column. Here we will name the resulting matrix Z.
4.Calculating the Covariance of Z
To calculate the covariance of Z, we will take the matrix Z and transpose it.
After transposing, we will multiply it by Z. The output matrix Z^T Z will be,
up to a constant factor of 1/(n - 1) for n data items, the covariance matrix
of Z.
5.Calculating the Eigen Values and Eigen Vectors
Now we need to calculate the eigenvalues and eigenvectors of the resultant
covariance matrix of Z. The eigenvectors of the covariance matrix are the
directions of the axes with high information, and the eigenvalue of each
eigenvector gives the amount of variance along that direction.
6. Sorting the Eigen Vectors
In this step, we will take all the eigenvalues and sort them in decreasing
order, which means from largest to smallest, and simultaneously sort the
eigenvectors accordingly in the matrix P of eigenvectors. The resultant
matrix will be named P*.
7.Calculating the new features Or Principal Components
Here we will calculate the new features. To do this, we will multiply the
matrix Z by P*. In the resultant matrix Z* = Z P*, each observation is a
linear combination of the original features. The columns of the Z* matrix
are uncorrelated with each other.
8.Remove less or unimportant features from the new dataset.
The new feature set has now been obtained, so we will decide here what to
keep and what to remove. We will keep only the relevant or important
features in the new dataset, and the unimportant features will be removed.
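The steps above can be sketched end to end in plain Python for a tiny dataset. The sketch assumes that standardizing means centring each column and dividing by its sample standard deviation, and it finds only the leading eigenvector, using power iteration instead of a full eigendecomposition. The data and all function names are illustrative assumptions.

```python
import math


def standardize(rows):
    """Step 3: centre each column and divide by its standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance(z):
    """Step 4: Z^T Z / (n - 1), valid because Z is already centred."""
    n, d = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(d)]
            for a in range(d)]


def top_eigenpair(m, iters=50):
    """Steps 5 and 6, leading component only, by power iteration."""
    d = len(m)
    v = [1.0] + [0.0] * (d - 1)  # assumed start, not orthogonal to the answer
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * mv[i] for i in range(d))
    return eigenvalue, v


# Two perfectly correlated features on very different scales.
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
Z = standardize(X)
C = covariance(Z)
eigenvalue, direction = top_eigenpair(C)

# Step 7: project each standardized row onto the leading direction.
scores = [sum(r[j] * direction[j] for j in range(len(direction))) for r in Z]

# Step 8: share of total variance kept by this single component.
explained = eigenvalue / sum(C[i][i] for i in range(len(C)))
print(eigenvalue, direction, explained)
```

Because the two features carry the same information, one component keeps essentially all of the variance, so the second can be dropped.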
Applications of Principal Component Analysis :
 PCA is mainly used as the dimensionality reduction technique in various AI
applications such as computer vision, image compression, etc.
 It can also be used for finding hidden patterns if data has high dimensions.
Some fields where PCA is used are Finance, data mining, Psychology, etc.
 Evaluation metrics are tied to machine learning tasks. There are different
metrics for the tasks of classification, regression, ranking, clustering, topic
modeling, etc. Some metrics, such as precision-recall, are useful for multiple
tasks. Classification, regression, and ranking are examples of supervised
learning, which constitutes a majority of machine learning applications.
 Model Accuracy:
 Model accuracy in terms of classification models can be defined as the ratio
of correctly classified samples to the total number of samples:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
 True Positive (TP): an outcome where the model correctly predicts the
positive class.
 True Negative (TN): an outcome where the model correctly predicts the
negative class.
 False Positive (FP): an outcome where the model incorrectly predicts the
positive class.
 False Negative (FN): an outcome where the model incorrectly predicts the
negative class.
Problem Statement: Build a prediction model for hospitals to identify
whether a patient is suffering from cancer or not.
 Binary Classification Model: Predict whether the patient has cancer or
not.
 Let’s assume we have a labelled dataset of 100 cases: 9 labeled as
‘Cancer’ and 91 labeled as ‘Normal’.
 Let’s try calculating the accuracy of this model on the above dataset, given
the following results:
 In the above case let’s define the TP, TN, FP, FN:
 TP (Actual Cancer and predicted Cancer) = 1
 TN (Actual Normal and predicted Normal) = 90
 FN (Actual Cancer and predicted Normal) = 8
 FP (Actual Normal and predicted Cancer) = 1
 So the accuracy of this model is (1 + 90) / 100 = 91%. But the question
remains whether this model is useful, even though it is so accurate.
 This highly accurate model may not be useful, as it fails to identify most
of the actual cancer patients, and this can have severe consequences.
 So, for these types of scenarios, how can we trust machine learning
models?
 Accuracy alone doesn’t tell the full story when we’re working with a
class-imbalanced dataset like this one, where there’s a significant disparity
between the number of positive and negative labels.
 Precision and Recall :
 In a classification task, the precision for a class is the number of true
positives (i.e. the number of items correctly labeled as belonging to the
positive class) divided by the total number of elements labeled as belonging
to the positive class (i.e. the sum of true positives and false positives,
the latter being items incorrectly labeled as belonging to the class).
Precision = TP / (TP + FP)
 Recall is defined as the number of true positives divided by the total number
of elements that actually belong to the positive class (i.e. the sum of true
positives and false negatives, the latter being items which were not labeled
as belonging to the positive class but should have been).
Recall = TP / (TP + FN)
High precision means that an algorithm returned substantially more
relevant results than irrelevant ones.
High recall means that an algorithm returned most of the relevant results.
 Let’s try to measure precision and recall for our cancer prediction use case:
Our model has a precision value of 1 / (1 + 1) = 0.5. In other words, when
it predicts cancer, it’s correct 50% of the time.
Our model has a recall value of 1 / (1 + 8) = 0.11. In other words, it
correctly identifies only 11% of all cancer patients.
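The three figures for the cancer example can be reproduced directly from the TP, TN, FP and FN counts given earlier. The function names are illustrative; the counts are the ones from the slides.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Share of predicted positives that were actually positive."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of actual positives that the model found."""
    return tp / (tp + fn)


# Counts from the cancer prediction example.
tp, tn, fp, fn = 1, 90, 1, 8

acc = accuracy(tp, tn, fp, fn)
prec = precision(tp, fp)
rec = recall(tp, fn)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

The high accuracy comes almost entirely from the 90 true negatives, which is why recall is the more telling number here.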
Classification Accuracy :
 Classification accuracy is what we usually mean when we use the term
accuracy. It is the ratio of the number of correct predictions to the total
number of input samples.
 It works well only if there is a roughly equal number of samples belonging
to each class.
 For example, consider that there are 98% samples of class A and 2%
samples of class B in our training set. Then our model can easily get 98%
training accuracy by simply predicting that every training sample belongs to
class A.
 When the same model is tested on a test set with 60% samples of class A
and 40% samples of class B, the test accuracy drops to 60%. Classification
accuracy is a simple and useful metric, but it can give us a false sense of
achieving high accuracy.
 The real problem arises when the cost of misclassifying the minority
class samples is very high. If we deal with a rare but fatal disease, the cost
of failing to diagnose the disease of a sick person is much higher than the
cost of sending a healthy person to more tests.
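The 98% versus 60% example can be checked with a model that ignores its input and always predicts class A. The label lists below are constructed to match the stated class proportions, assuming 100 samples in each set; the function names are illustrative.

```python
def always_a(sample):
    """A degenerate model that predicts class A for every input."""
    return "A"


def accuracy_on(labels, model):
    """Share of samples whose label matches the model's prediction."""
    correct = sum(1 for y in labels if model(None) == y)
    return correct / len(labels)


train_labels = ["A"] * 98 + ["B"] * 2   # 98% class A, 2% class B
test_labels = ["A"] * 60 + ["B"] * 40   # 60% class A, 40% class B

train_accuracy = accuracy_on(train_labels, always_a)
test_accuracy = accuracy_on(test_labels, always_a)

# Recall for the minority class B: the model never predicts B at all.
found_b = sum(1 for y in test_labels if y == "B" and always_a(None) == "B")
recall_b = found_b / test_labels.count("B")
print(train_accuracy, test_accuracy, recall_b)
```

A recall of zero for class B exposes the problem that the 98% training accuracy hides.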
Thanks !!!
Intro of Machine Learning Models .pptx

  • 1. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 2. Types of ML :-  There are four types of machine learning: 1.Supervised Learning:  Supervised Learning is the one, where you can consider the learning is guided by a teacher. We have a dataset which acts as a teacher and its role is to train the model or the machine. Once the model gets trained it can start making a prediction or decision when new data is given to it.  Supervised learning uses labelled training data to learn the mapping function that turns input variables (X) into the output variable (Y). In other words, it solves for f in the following equation: Y = f (X)  This allows us to accurately generate outputs when given new inputs. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 3.  Two types of supervised learning are: classification and regression.  Classification is used to predict the outcome of a given sample when the output variable is in the form of categories. A classification model might look at the input data and try to predict labels like “sick” or “healthy.”  Regression is used to predict the outcome of a given sample when the output variable is in the form of real values. For example, a regression model might process input data to predict the amount of rainfall, the height of a person, etc.  Ensembling is another type of supervised learning. It means combining the predictions of multiple machine learning models that are individually weak to produce a more accurate prediction on a new sample. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 4.  Thus, In supervised Machine Learning  “The outcome or output for the given input is known before itself” and the machine must be able to map or assign the given input to the output. Multiple images of a cat, dog, orange, apple etc here the images are labelled. It is fed into the machine for training and the machine must identify the same. Just like a human child is shown a cat and told so, when it sees a completely different cat among others still identifies it as a cat, the same method is employed here. In short,Supervised Learning means – Train Me! Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 5. 2.Unsupervised Learning:  Unsupervised learning models are used when we only have the input variables (X) and no corresponding output variables.  They use unlabelled training data to model the underlying structure of the data. Input data is given and the model is run on it. The image or the input given are mixed together and insights on the inputs can be found .  The model learns through observation and finds structures in the data. Once the model is given a dataset, it automatically finds patterns and relationships in the dataset by creating clusters in it.  What it cannot do is add labels to the cluster, like it cannot say this a group of apples or mangoes, but it will separate all the apples from mangoes. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 6.  Two types of unsupervised learning are:Association and Clustering  Association is used to discover the probability of the co-occurrence of items in a collection. It is extensively used in market-basket analysis. For example, an association model might be used to discover that if a customer purchases bread, s/he is 80% likely to also purchase eggs.  Clustering is used to group samples such that objects within the same cluster are more similar to each other than to the objects from another cluster.  Apriori, K-means, PCA — are examples of unsupervised learning.  Suppose we presented images of apples, bananas and mangoes to the model, so what it does, based on some patterns and relationships it creates clusters and divides the dataset into those clusters. Now if a new data is fed to the model, it adds it to one of the created clusters. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 7. Fig: grouping of similar data Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 8. 3.Semi-supervised Learning:  It is in-between that of Supervised and Unsupervised Learning. Where the combination is used to produce the desired results and it is the most important in real-world scenarios where all the data available are a combination of labelled and unlabelled data. 3.Reinforced Learning:  The machine is exposed to an environment where it gets trained by trial and error method, here it is trained to make a much specific decision. The machine learns from past experience and tries to capture the best possible knowledge to make accurate decisions based on the feedback received. Algorithm allows an agent to decide the best next action based on its current state by learning behaviours that will maximize a reward. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 9.  It is the ability of an agent to interact with the environment and find out what is the best outcome. It follows the concept of hit and trial method. The agent is rewarded or penaltized with a point for a correct or a wrong answer, and on the basis of the positive reward points gained the model trains itself. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 10. Fig : Types of Machine Learning Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 11. 1.Overfitting :Over fitting refers to a model that models the training data too well.  Over fitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data. This means that the noise or random fluctuations in the training data is picked up and learned as concepts by the model. The problem is that these concepts do not apply to new data and negatively impact the models ability to generalize.  Over fitting is more likely with nonparametric and nonlinear models that have more flexibility when learning a target function. As such, many nonparametric machine learning algorithms also include parameters or techniques to limit and constrain how much detail the model learns. 2.Underfitting : Under fitting refers to a model that can neither model the training data nor generalize to new data.  An under fit machine learning model is not a suitable model and will be obvious as it will have poor performance on the training data.  Under fitting is often not discussed as it is easy to detect given a good performance metric. The remedy is to move on and try alternate machine learning algorithms. Nevertheless, it does provide a good contrast to the problem of over fitting. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 12.  Bias: It gives us how closeness is our predictive model’s to training data after averaging predict value. Generally algorithm has high bias which help them to learn fast and easy to understand but are less flexible. That looses it ability to predict complex problem, so it fails to explain the algorithm bias. This results in under fitting of our model.  Getting more training data will not help much.  Variance: It define as deviation of predictions, in simple it is the amount which tell us when its point data value change or a different data is use how much the predicted value will be affected for same model or for different model respectively. Ideally, the predicted value which we predict from model should remain same even changing from one training data-sets to another, but if the model has high variance then model predict value are affect by value of data-sets.  “Signal” as the true underlying pattern that you wish to learn from the data.  “Noise” on the other hand, refers to the irrelevant information or randomness in a dataset. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 13.  Overfitting and Underfitting are the two main problems that occur in machine learning and degrade the performance of the machine learning models.  The main goal of each machine learning model is to generalize well. Here generalization defines the ability of an ML model to provide a suitable output by adapting the given set of unknown input. It means after providing training on the dataset, it can produce reliable and accurate output. Hence, the underfitting and overfitting are the two terms that need to be checked for the performance of the model and whether the model is generalizing well or not. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 14.  Over fitting :  Overfitting occurs when our machine learning model tries to cover all the data points or more than the required data points present in the given dataset. Because of this, the model starts caching noise and inaccurate values present in the dataset, and all these factors reduce the efficiency and accuracy of the model. The overfitted model has low bias and high variance.  The chances of occurrence of overfitting increase as much we provide training to our model. It means the more we train our model, the more chances of occurring the overfitted model.  Overfitting is the main problem that occurs in supervised learning.  Example: The concept of the overfitting can be understood by the below graph of the linear regression output: Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 15. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 16.  In above graph, the model tries to cover all the data points present in the scatter plot. It may look efficient, but in reality, it is not so. Because the goal of the regression model to find the best fit line, but here we have not got any best fit, so, it will generate the prediction errors.  How to avoid the Overfitting in Model :  Both overfitting and underfitting cause the degraded performance of the machine learning model. But the main cause is overfitting, so there are some ways by which we can reduce the occurrence of overfitting in our model.  Cross-Validation  Training with more data  Removing features  Early stopping the training  Regularization  Ensembling Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 17.  Underfitting :  Underfitting occurs when our machine learning model is not able to capture the underlying trend of the data. To avoid the overfitting in the model, the fed of training data can be stopped at an early stage, due to which the model may not learn enough from the training data. As a result, it may fail to find the best fit of the dominant trend in the data.  In the case of underfitting, the model is not able to learn enough from the training data, and hence it reduces the accuracy and produces unreliable predictions.  An underfitted model has high bias and low variance.  Example: We can understand the underfitting using below output of the linear regression model: Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 18. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 19.  In above graph, the model is unable to capture the data points present in the plot.  How to avoid underfitting:  By increasing the training time of the model.  By increasing the number of features. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 20.  Goodness of Fit :  The "Goodness of fit" term is taken from the statistics, and the goal of the machine learning models to achieve the goodness of fit. In statistics modeling, it defines how closely the result or predicted values match the true values of the dataset.  The model with a good fit is between the underfitted and overfitted model, and ideally, it makes predictions with 0 errors, but in practice, it is difficult to achieve it.  There are two other methods by which we can get a good point for our model, which are the resampling method to estimate model accuracy and validation dataset. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 21.  Machine learning life cycle is a cyclic process to build an efficient machine learning project. The main purpose of the life cycle is to find a solution to the problem or project.  Machine learning life cycle involves seven major steps, which are given below:  Gathering Data  Data preparation  Data Wrangling  Analyze Data  Train the model  Test the model  Deployment Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 22. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 23.  In the complete life cycle process, to solve a problem, we create a machine learning system called "model", and this model is created by providing "training". But to train a model, we need data, hence, life cycle starts by collecting data.  The most important thing in the complete process is to understand the problem and to know the purpose of the problem. 1. Gathering Data:  Data Gathering is the first step of the machine learning life cycle. The goal of this step is to identify and obtain all data-related problems.  In this step, we need to identify the different data sources, as data can be collected from various sources such as files, database, internet, or mobile devices. It is one of the most important steps of the life cycle. The quantity and quality of the collected data will determine the efficiency of the output. The more will be the data, the more accurate will be the prediction. Mrs.Harsha Patil,Dr.D.Y.Patil ACS College,Pimpri,Pune
  • 24. This step includes the following tasks: identify the various data sources; collect the data; integrate the data obtained from the different sources. By performing these tasks, we get a coherent set of data, also called a dataset, which is used in the further steps. 2. Data preparation: After collecting the data, we need to prepare it for the further steps. Data preparation is the step where we put our data into a suitable place and prepare it for use in machine learning training. First we put all the data together, and then we randomize its ordering. This step can be further divided into two processes. Data exploration: this is used to understand the nature of the data we have to work with, that is, its characteristics, format, and quality. A better understanding of the data leads to a more effective outcome. Here we look for correlations, general trends, and outliers. Data pre-processing: the next step is pre-processing the data for analysis.
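As a rough illustration of the preparation step, the sketch below combines two sources, randomizes the ordering, and checks one correlation as a simple form of exploration. The records, the column meaning (hours studied, exam score), and the fixed seed are all made up for this example.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical records from two sources: (hours_studied, exam_score).
source_a = [(1, 52), (2, 55), (3, 61)]
source_b = [(4, 64), (5, 70), (6, 75)]

# Put all data together, then randomize the ordering (seeded for repeatability).
dataset = source_a + source_b
random.Random(0).shuffle(dataset)

# Exploration: how strongly are the two columns related?
hours = [row[0] for row in dataset]
scores = [row[1] for row in dataset]
r = pearson(hours, scores)
print(f"rows={len(dataset)}, correlation={r:.3f}")
```

Shuffling changes only the order of the rows, so the correlation is the same before and after it.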
  • 25. 3. Data Wrangling: Data wrangling is the process of cleaning raw data and converting it into a usable format. It covers cleaning the data, selecting the variables to use, and transforming the data into a proper format so that it is more suitable for analysis in the next step. It is one of the most important steps of the complete process. Cleaning is required to address quality issues. Not all of the data we have collected is necessarily useful. In real-world applications, collected data may have various issues, including: missing values; duplicate data; invalid data; noise. We therefore use various filtering techniques to clean the data. These issues must be detected and removed because they can negatively affect the quality of the outcome.
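The cleaning described above can be sketched as a small filter. The record layout (patient id, age, weight) and the plausible age range of 0 to 120 are assumptions chosen for illustration, not part of the slides.

```python
# Hypothetical raw records: (patient_id, age, weight_kg). None marks a missing value.
raw = [
    (1, 34, 70.5),
    (2, None, 82.0),   # missing value
    (1, 34, 70.5),     # duplicate of the first record
    (3, -5, 60.0),     # invalid age
    (4, 51, 77.3),
]

def clean(records):
    """Drop records with missing values, invalid ages, or exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        if any(value is None for value in rec):
            continue                  # missing value
        _, age, _ = rec
        if not 0 <= age <= 120:       # assumed plausible range for an age
            continue                  # invalid data
        if rec in seen:
            continue                  # duplicate data
        seen.add(rec)
        cleaned.append(rec)
    return cleaned

cleaned = clean(raw)
print(cleaned)
```

Dropping rows is only one option; depending on the project, missing values might instead be filled in, which this sketch does not attempt.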
  • 26. 4. Data Analysis: The cleaned and prepared data is now passed on to the analysis step. This step involves: selection of analytical techniques; building models; reviewing the result. The aim of this step is to build a machine learning model that analyzes the data using various analytical techniques, and to review the outcome. It starts with determining the type of problem, where we select a machine learning technique such as classification, regression, cluster analysis, or association. We then build the model using the prepared data and evaluate it. Hence, in this step, we take the data and use machine learning algorithms to build the model.
  • 27. 5. Train Model: The next step is to train the model. In this step we train our model to improve its performance and so obtain a better outcome for the problem. We use datasets to train the model with various machine learning algorithms. Training is required so that the model can learn the various patterns, rules, and features. 6. Test Model: Once our machine learning model has been trained on a given dataset, we test it. In this step, we check the accuracy of our model by providing a test dataset to it. Testing determines the percentage accuracy of the model with respect to the requirements of the project or problem.
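To make the train and test steps concrete, here is a minimal sketch using a nearest-centroid classifier on one feature. The classifier choice, the temperature data, and the "healthy"/"sick" labels are illustrative assumptions; the slides do not prescribe a particular algorithm.

```python
def fit_centroids(samples):
    """Training: compute the mean feature value (centroid) of each class."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

# Hypothetical 1-D data: (body_temperature_celsius, label).
train = [(36.5, "healthy"), (36.8, "healthy"), (37.0, "healthy"),
         (38.6, "sick"), (39.1, "sick"), (39.4, "sick")]
test = [(36.6, "healthy"), (37.2, "healthy"), (38.9, "sick"), (37.8, "sick")]

# Step 5: train on the training set only.
centroids = fit_centroids(train)

# Step 6: measure accuracy on data the model has not seen.
correct = sum(predict(centroids, x) == label for x, label in test)
accuracy = correct / len(test)
print(f"test accuracy = {accuracy:.0%}")
```

The test sample at 37.8 lies closer to the "healthy" centroid and is misclassified, so the model scores three out of four on the test set.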
  • 28. 7. Deployment: The last step of the machine learning life cycle is deployment, where we deploy the model in a real-world system. If the model prepared above produces accurate results that meet our requirements at an acceptable speed, we deploy it in the real system. Before deploying, however, we check whether the model is still improving its performance on the available data. The deployment phase is similar to making the final report for a project.
  • 29. In machine learning classification problems, there are often too many factors on the basis of which the final classification is done. These factors are variables called features. The higher the number of features, the harder it gets to visualize the training set and then work on it. Often, many of these features are correlated and hence redundant. This is where dimensionality reduction algorithms come into play. Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It can be divided into feature selection and feature extraction. A 3-D classification problem can be hard to visualize, whereas a 2-D one can be mapped to a simple two-dimensional space, and a 1-D problem to a simple line. The figure on the next slide illustrates this concept, where a 3-D feature space is split into two 1-D feature spaces, and later, if these are found to be correlated, the number of features can be reduced even further.
  • 30. [Figure: a 3-D feature space reduced to lower-dimensional feature spaces]
  • 31. Components of Dimensionality Reduction: Feature selection: here we try to find a subset of the original set of variables, or features, that can be used to model the problem. It usually takes one of three forms: filter, wrapper, or embedded methods. Feature extraction: this reduces data in a high-dimensional space to a lower-dimensional space, i.e. a space with fewer dimensions. Methods of Dimensionality Reduction: The various methods used for dimensionality reduction include: Principal Component Analysis (PCA); Linear Discriminant Analysis (LDA); Generalized Discriminant Analysis (GDA).
  • 32. Dimensionality reduction may be either linear or non-linear, depending on the method used. The prime linear method is Principal Component Analysis, or PCA. Principal Component Analysis (PCA): This method was introduced by Karl Pearson. It works on the condition that, while the data in a higher-dimensional space is mapped to data in a lower-dimensional space, the variance of the data in the lower-dimensional space should be maximal.
  • 33. It involves the following steps: construct the covariance matrix of the data; compute the eigenvectors of this matrix; use the eigenvectors corresponding to the largest eigenvalues to reconstruct a large fraction of the variance of the original data. Hence, we are left with a smaller number of eigenvectors, and some data may have been lost in the process. However, the most important variances should be retained by the remaining eigenvectors.
  • 34. Advantages of Dimensionality Reduction: It helps in data compression and hence reduces storage space. It reduces computation time. It also helps remove redundant features, if any. Disadvantages of Dimensionality Reduction: It may lead to some amount of data loss. PCA tends to find linear correlations between variables, which is sometimes undesirable. PCA fails in cases where mean and covariance are not enough to define the dataset. We may not know how many principal components to keep; in practice, some rules of thumb are applied.
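One commonly used rule of thumb for the number of components is to keep the smallest number whose eigenvalues account for a chosen share of the total variance. The sketch below assumes a 95% threshold and made-up eigenvalues; both are illustrative choices, not values from the slides.

```python
def components_to_keep(eigenvalues, threshold=0.95):
    """Smallest k whose k largest eigenvalues explain at least `threshold` of the variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)

# Hypothetical eigenvalues of a covariance matrix (in no particular order).
eigenvalues = [0.3, 4.0, 0.2, 1.5]
k = components_to_keep(eigenvalues)
print(f"keep {k} of {len(eigenvalues)} components")
```

Lowering the threshold keeps fewer components at the cost of discarding more variance, which is the data loss listed among the disadvantages above.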
  • 35. Principal Component Analysis: Principal Component Analysis is an unsupervised learning algorithm used for dimensionality reduction in machine learning. It is a statistical procedure that converts observations of correlated features into a set of linearly uncorrelated features by means of an orthogonal transformation. These new transformed features are called the principal components. It is one of the popular tools for exploratory data analysis and predictive modeling. It draws out strong patterns from a dataset by reducing the number of dimensions while retaining as much of the variance as possible. PCA generally tries to find a lower-dimensional surface onto which to project the high-dimensional data. PCA works by considering the variance of each attribute, because an attribute with high variance tends to show a good split between the classes, and this is how it reduces the dimensionality. Some real-world applications of PCA are image processing, movie recommendation systems, and optimizing the power allocation in various communication channels. It is a feature extraction technique, so it keeps the most important components and drops the least important ones.
  • 36. The PCA algorithm is based on mathematical concepts such as: variance and covariance; eigenvalues and eigenvectors. Some common terms used in the PCA algorithm: Dimensionality: the number of features or variables present in the given dataset. More simply, it is the number of columns in the dataset. Correlation: how strongly two variables are related to each other, so that if one changes, the other also changes. The correlation value ranges from -1 to +1. Here, -1 indicates a perfect negative linear relationship between the variables, and +1 indicates a perfect positive linear relationship. Orthogonal: the variables are not correlated with each other, and hence the correlation between the pair of variables is zero. Eigenvectors: given a square matrix M and a non-zero vector v, v is an eigenvector of M if Mv is a scalar multiple of v. Covariance Matrix: a matrix containing the covariance between each pair of variables.
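The eigenvector definition can be checked numerically. In this small sketch the 2x2 matrix M and the candidate vectors are arbitrary illustrative values, and `is_eigenvector` is a helper written for this example.

```python
def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def is_eigenvector(matrix, vector, tol=1e-12):
    """True if matrix * vector is a scalar multiple of the non-zero vector."""
    product = matvec(matrix, vector)
    # Estimate the scalar from the component of the vector with largest magnitude.
    i = max(range(len(vector)), key=lambda j: abs(vector[j]))
    scale = product[i] / vector[i]
    return all(abs(p - scale * x) <= tol for p, x in zip(product, vector))

M = [[2.0, 1.0],
     [1.0, 2.0]]

v = [1.0, 1.0]    # M v = [3, 3] = 3 * v, so v is an eigenvector with eigenvalue 3
w = [1.0, 0.0]    # M w = [2, 1], which is not a multiple of w

print(matvec(M, v), is_eigenvector(M, v))
print(matvec(M, w), is_eigenvector(M, w))
```

The scalar by which v is stretched, here 3, is the eigenvalue belonging to that eigenvector.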
  • 37. Principal Components in PCA: As described above, the transformed new features, or the output of PCA, are the principal components. The number of these PCs is equal to or less than the number of original features in the dataset. Some properties of the principal components are given below: Each principal component must be a linear combination of the original features. The components are orthogonal, i.e., the correlation between any pair of components is zero. The importance of each component decreases when going from 1 to n: PC 1 has the most importance, and PC n has the least.
  • 38. Steps for PCA algorithm: 1. Getting the dataset: First, we take the input dataset and divide it into two subparts X and Y, where X is the training set and Y is the validation set. 2. Representing data in a structure: Next, we represent the dataset as a two-dimensional matrix of the independent variable X. Each row corresponds to a data item and each column corresponds to a feature. The number of columns is the dimensionality of the dataset. 3. Standardizing the data: In this step, we standardize the dataset. In unstandardized data, features with high variance are treated as more important than features with lower variance. If the importance of a feature should be independent of its variance, we divide each data item in a column by the standard deviation of that column. We name the resulting matrix Z.
  • 39. 4. Calculating the covariance of Z: To calculate the covariance of Z, we transpose the matrix Z and multiply the transpose by Z. The output matrix Z^T Z is the covariance matrix of Z, up to a constant factor of 1/(n - 1) for n data items and assuming the columns of Z have zero mean. 5. Calculating the eigenvalues and eigenvectors: Now we calculate the eigenvalues and eigenvectors of the resulting covariance matrix. The eigenvectors of the covariance matrix are the directions of the axes along which the data varies, and each eigenvalue gives the variance along its eigenvector, so the largest eigenvalues mark the directions that carry the most information. 6. Sorting the eigenvectors: In this step, we sort the eigenvalues in decreasing order, from largest to smallest, and sort the columns of the eigenvector matrix P in the same order. The resulting matrix is named P*. 7. Calculating the new features, or principal components: To calculate the new features, we multiply Z by the P* matrix. In the resulting matrix Z* = Z P*, each observation is a linear combination of the original features, and the columns of Z* are uncorrelated with each other.
  • 40. 8. Removing less important or unimportant features from the new dataset: Now that we have the new feature set, we decide what to keep and what to remove. We keep only the relevant or important features in the new dataset and remove the unimportant ones. Applications of Principal Component Analysis: PCA is mainly used as a dimensionality reduction technique in AI applications such as computer vision and image compression. It can also be used for finding hidden patterns in high-dimensional data. Some fields where PCA is used are finance, data mining, and psychology.
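Steps 3 to 8 of the PCA algorithm can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions: the columns are centered as well as scaled, the covariance is divided by n - 1, the synthetic data is made up, and the split into training and validation sets from step 1 is skipped.

```python
import numpy as np

def pca(X, n_components):
    """PCA via eigen-decomposition of the covariance matrix.

    X has one row per data item and one column per feature.
    Returns the reduced data and all eigenvalues, largest first.
    """
    # Step 3: standardize each column (zero mean, unit standard deviation).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 4: covariance matrix of Z.
    cov = Z.T @ Z / (Z.shape[0] - 1)
    # Step 5: eigenvalues and eigenvectors (eigh handles symmetric matrices).
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Step 6: sort from largest to smallest eigenvalue.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, P_star = eigenvalues[order], eigenvectors[:, order]
    # Step 7: project the data onto the sorted eigenvectors.
    Z_star = Z @ P_star
    # Step 8: keep only the first n_components columns.
    return Z_star[:, :n_components], eigenvalues

# Synthetic data: two strongly correlated features and one independent feature.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t + 0.1 * rng.normal(size=200), rng.normal(size=200)])

scores, eigenvalues = pca(X, n_components=2)
print(scores.shape)
print(eigenvalues / eigenvalues.sum())   # share of variance per component
```

Because two of the three features are almost copies of each other, nearly all of the variance lands in the first two components, and the third can be dropped with little loss.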
  • 41. Evaluation metrics are tied to machine learning tasks. There are different metrics for the tasks of classification, regression, ranking, clustering, topic modeling, etc. Some metrics, such as precision and recall, are useful for multiple tasks. Classification, regression, and ranking are examples of supervised learning, which constitutes the majority of machine learning applications. Model Accuracy: For classification models, accuracy is defined as the ratio of correctly classified samples to the total number of samples.
  • 42. Accuracy = (number of correct predictions) / (total number of predictions) = (TP + TN) / (TP + TN + FP + FN)
  • 43. True Positive (TP): an outcome where the model correctly predicts the positive class. True Negative (TN): an outcome where the model correctly predicts the negative class. False Positive (FP): an outcome where the model incorrectly predicts the positive class. False Negative (FN): an outcome where the model incorrectly predicts the negative class. Problem Statement: Build a prediction model for hospitals to identify whether a patient is suffering from cancer or not.
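The four outcomes can be counted directly from paired lists of actual and predicted labels. The six labels below are a made-up toy sample, with "Cancer" taken as the positive class to match the problem statement.

```python
def confusion_counts(actual, predicted, positive="Cancer"):
    """Return (TP, TN, FP, FN) for a binary classification result."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1      # predicted positive, actually positive
            else:
                fp += 1      # predicted positive, actually negative
        else:
            if a == positive:
                fn += 1      # predicted negative, actually positive
            else:
                tn += 1      # predicted negative, actually negative
    return tp, tn, fp, fn

# Hypothetical labels for six patients.
actual    = ["Cancer", "Cancer", "Normal", "Normal", "Normal", "Cancer"]
predicted = ["Cancer", "Normal", "Normal", "Cancer", "Normal", "Normal"]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
```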
  • 44. Binary Classification Model: predict whether the patient has cancer or not. Let's assume we have a labeled dataset of 100 cases, 9 labeled as 'Cancer' and 91 labeled as 'Normal'. Let's try calculating the accuracy of this model on the above dataset, given the following results:
  • 45. In the above case let's define the TP, TN, FP, FN: TP (actual Cancer and predicted Cancer) = 1; TN (actual Normal and predicted Normal) = 90; FN (actual Cancer and predicted Normal) = 8; FP (actual Normal and predicted Cancer) = 1.
  • 46. So the accuracy of this model is 91%. But is this model useful, even though it is so accurate? This highly accurate model may not be useful, because it fails to identify most of the actual cancer patients, which can have severe consequences. So how can we trust machine learning models in these types of scenarios? Accuracy alone doesn't tell the full story when we're working with a class-imbalanced dataset like this one, where there is a significant disparity between the number of positive and negative labels. Precision and Recall: In a classification task, the precision for a class is the number of true positives (i.e. the number of items correctly labeled as belonging to the positive class) divided by the total number of elements labeled as belonging to the positive class (i.e. the sum of true positives and false positives, the latter being items incorrectly labeled as belonging to the class).
  • 47. Precision = TP / (TP + FP)
  • 48. Recall is defined as the number of true positives divided by the total number of elements that actually belong to the positive class (i.e. the sum of true positives and false negatives, the latter being items that were not labeled as belonging to the positive class but should have been), that is, Recall = TP / (TP + FN). High precision means that an algorithm returned substantially more relevant results than irrelevant ones. High recall means that an algorithm returned most of the relevant results.
  • 49. Let's try to measure precision and recall for our cancer prediction use case. Our model has a precision of 0.5: when it predicts cancer, it is correct 50% of the time. Our model has a recall of 0.11: it correctly identifies only 11% of all cancer patients.
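The arithmetic for the cancer example can be reproduced from the four counts given on slide 45 (TP = 1, TN = 90, FP = 1, FN = 8). The function names are chosen for this sketch, which assumes none of the denominators is zero.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Share of predicted positives that are actually positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of actual positives that the model finds."""
    return tp / (tp + fn)

# Counts from the cancer prediction example.
tp, tn, fp, fn = 1, 90, 1, 8

acc = accuracy(tp, tn, fp, fn)   # 91 / 100
prec = precision(tp, fp)         # 1 / 2
rec = recall(tp, fn)             # 1 / 9

print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

The gap between 91% accuracy and 11% recall is what the example is meant to expose: the model is right about most patients only because most patients are healthy.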
  • 50. Classification Accuracy: Classification accuracy is what we usually mean when we use the term accuracy. It is the ratio of the number of correct predictions to the total number of input samples. It works well only if there are roughly equal numbers of samples belonging to each class. For example, suppose that 98% of the samples in our training set belong to class A and 2% to class B. Then our model can easily reach 98% training accuracy by simply predicting that every training sample belongs to class A. When the same model is tested on a test set with 60% samples of class A and 40% samples of class B, the test accuracy drops to 60%. Classification accuracy is useful, but it can give a false sense of achieving high performance. The real problem arises when the cost of misclassifying the minority class samples is very high. If we deal with a rare but fatal disease, the cost of failing to diagnose the disease in a sick person is much higher than the cost of sending a healthy person for more tests.
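The 98% versus 60% example can be reproduced with a "model" that ignores its input and always predicts class A. The set sizes of 100 samples each are an assumption made to match the stated percentages.

```python
def accuracy(actual, predicted):
    """Ratio of correct predictions to the total number of samples."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def always_a(samples):
    """A trivial model that predicts class A for every sample."""
    return ["A"] * len(samples)

# 98% class A in training, but only 60% class A in the test set.
train_labels = ["A"] * 98 + ["B"] * 2
test_labels = ["A"] * 60 + ["B"] * 40

train_acc = accuracy(train_labels, always_a(train_labels))
test_acc = accuracy(test_labels, always_a(test_labels))

# Recall for class B: how many actual B samples were predicted as B?
found_b = sum(a == "B" and p == "B"
              for a, p in zip(test_labels, always_a(test_labels)))
recall_b = found_b / test_labels.count("B")

print(f"train accuracy={train_acc:.2f} test accuracy={test_acc:.2f} recall(B)={recall_b:.2f}")
```

The recall for class B is zero on both sets, which the accuracy figures alone do not reveal.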
  • 51. Thanks!