INDUSTRIAL TRAINING
PRESENTATION
Submitted by:
Aniket Anil Bhavsar
TE B 10
Submitted to:
Prof. Smita T. Patil
CONTENTS
▪ Training Details
▪ Certificate
▪ Introduction to Machine Learning
▪ Project Description
▪ Algorithms Used
▪ Visualization
▪ Conclusion
10 May 2023 2
TRAINING DETAILS
Start Date : 10 January 2022
End Date : 10 February 2022
Mode of Training : Online
Training Company : Cognifront Private Ltd.
CERTIFICATE
INTRODUCTION TO MACHINE LEARNING
▪ Machine learning is a branch of Artificial Intelligence (AI) and computer
science. It focuses on the use of data and algorithms to imitate the way that
humans learn.
▪ ML explores the study and construction of algorithms that can learn from data
and make predictions on data.
▪ ML allows software applications to become more accurate at predicting
outcomes without being explicitly programmed to do so. Algorithms use
historical data as input to predict new output values.
TYPES OF MACHINE LEARNING
SUPERVISED LEARNING
Fig. Supervised Learning
UNSUPERVISED LEARNING
▪ In an unsupervised learning model, the algorithm learns from an unlabelled dataset
and tries to make sense of it by extracting features, co-occurrences, and underlying
patterns on its own.
▪ Unsupervised learning cannot be directly applied to a regression or classification
problem because, unlike supervised learning, we have the input data but no
corresponding output data.
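As a minimal sketch of learning from unlabelled data, the following groups one-dimensional values into two clusters with a basic k-means loop. The data values, the starting centres, and the function name `kmeans_1d` are illustrative assumptions and are not part of the project.

```python
def kmeans_1d(values, centres, iterations=10):
    """Group numbers around the nearest centre; no labels are used."""
    centres = list(centres)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        clusters = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Update step: move each centre to the mean of its cluster.
        centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return centres, clusters


# Hypothetical unlabelled data with two visible groups.
data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centres, clusters = kmeans_1d(data, centres=[0.0, 5.0])
```

The algorithm is never told which group a value belongs to; the grouping emerges from the distances alone.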
REINFORCEMENT LEARNING
▪ Reinforcement learning is an area of Machine Learning. It is about taking suitable
action to maximize reward in a particular situation.
▪ It is employed by various software and machines to find the best possible behavior
or path to take in a specific situation.
▪ The input is an initial state from which the model starts.
▪ There are many possible outputs, as there are a variety of solutions to a particular
problem.
▪ In the absence of a training dataset, the model is bound to learn from its experience.
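The points above (an initial state, a reward to maximize, learning from experience) can be illustrated with a small tabular Q-learning sketch. The five-state chain environment, the constants, and the names `step` and `train` are hypothetical choices made for this illustration only.

```python
import random

N_STATES = 5                      # states 0..4; reaching state 4 ends an episode
ACTIONS = {"left": -1, "right": +1}
GAMMA, ALPHA = 0.9, 0.5           # discount factor and learning rate (assumed values)


def step(state, action):
    """Move along the chain; reward 1 only when the goal state is reached."""
    nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0                               # the initial state
        while state != N_STATES - 1:
            action = rng.choice(list(ACTIONS))  # explore by acting at random
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # Q-learning update: adjust the estimate using the experienced reward.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = nxt
    return q


q = train()
# Greedy policy: in each non-terminal state, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

No labelled dataset is involved: the table `q` is filled in purely from the rewards observed while acting.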
PROJECT DESCRIPTION
▪ LOAN APPROVAL PREDICTION USING MACHINE LEARNING:
▪ A bank deals in all kinds of loans and has a presence across urban, semi-urban and
rural areas. The customer first applies for a loan, and the bank then validates the
customer's eligibility for it.
▪ The bank wants to automate the loan eligibility process (in real time) based on the
customer details provided while filling out the online application form. These details
are Gender, Marital Status, Education, number of Dependents, Income, Loan Amount,
Credit History, and others.
STEPS OF PREDICTIVE MODELLING
ALGORITHMS AND LIBRARIES USED
▪ ALGORITHMS USED IN PROJECT:
▪ Support Vector Machine (SVM)
▪ KNN (K Nearest Neighbor)
▪ Decision Tree
▪ Logistic Regression
▪ Random Forest
▪ LIBRARIES USED:
▪ NumPy
▪ Pandas
▪ Matplotlib
▪ Seaborn
▪ scikit-learn (train_test_split, accuracy_score, SVM)
▪ All the other libraries required for ML models.
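The train/test split listed above holds out part of the data so that a model is scored on records it has not seen. A minimal pure-Python sketch of that idea follows; `holdout_split`, the 20% test fraction, and the seed are assumptions for illustration, not the settings used in the project.

```python
import random


def holdout_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # The first n_test shuffled rows form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))            # stand-in for 10 applicant records
train_rows, test_rows = holdout_split(rows)
```

Fixing the seed makes the split repeatable, which keeps accuracy figures comparable between runs.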
SVM (Support Vector Machine)
▪ It is a classification method. In this algorithm, we plot each data item as a
point in n-dimensional space (where n is the number of features), with the
value of each feature being the value of a particular coordinate. The algorithm
then finds the hyperplane that best separates the classes.
▪ Accuracy score: 68.29%
▪ Mean absolute error: 0.3170
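An SVM classifier separates points in feature space with a hyperplane. The sketch below shows only the prediction step for a given linear hyperplane, not the training that finds it; the weights, bias, and sample points are hypothetical.

```python
def linear_decision(point, weights, bias):
    """Signed score of a point relative to the hyperplane w.x + b = 0."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def classify(point, weights, bias):
    """Points on the non-negative side get class 1, the others class 0."""
    return 1 if linear_decision(point, weights, bias) >= 0 else 0


# Hypothetical 2-feature hyperplane: x1 + x2 - 3 = 0.
weights, bias = [1.0, 1.0], -3.0
labels = [classify(p, weights, bias) for p in [(0.5, 1.0), (2.5, 2.0)]]
```

Each feature contributes one coordinate, so the same functions work unchanged for any number of features.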
KNN (K – NEAREST NEIGHBOR)
▪ It can be used for both classification and regression problems. However, it is
more widely used in classification problems in the industry. K nearest
neighbors is a simple algorithm that stores all available cases and classifies
each new case by a majority vote of its k nearest neighbors.
▪ Accuracy score: 63.41%
▪ Mean absolute error: 0.3658
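The majority vote described above can be sketched in a few lines. The points, the "Y"/"N" labels, and the choice k=3 are made up for illustration and do not come from the loan dataset.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),   # Euclidean distance
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical stored cases: two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["N", "N", "N", "Y", "Y", "Y"]
prediction = knn_predict(points, labels, query=(7, 8))
```

There is no training step beyond storing the cases; all the work happens at prediction time.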
DECISION TREE
▪ It is a type of supervised learning algorithm that is mostly used for
classification problems. Surprisingly, it works for both categorical and
continuous dependent variables. In this algorithm, we split the population into
two or more homogeneous sets. The split is based on the most significant
attributes (independent variables), to make the groups as distinct as possible.
▪ Accuracy score: 65.85%
▪ Mean absolute error: 0.3414
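One common way to decide which split makes the groups most homogeneous is Gini impurity. The sketch below searches a single numeric feature for the best threshold; the income values and 0/1 labels are hypothetical, and the slides do not say which splitting criterion the project used.

```python
def gini(labels):
    """Gini impurity: 0 for a pure group, 0.5 for a 50/50 binary mix."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)       # fraction of class 1
    return 1.0 - p * p - (1.0 - p) ** 2


def best_threshold(values, labels):
    """Try each observed value as a threshold; keep the lowest weighted impurity."""
    best = None
    for t in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (t, score)
    return best


income = [20, 25, 30, 60, 70, 80]   # hypothetical feature values
approved = [0, 0, 0, 1, 1, 1]       # hypothetical 0/1 labels
threshold, impurity = best_threshold(income, approved)
```

A full decision tree repeats this search on each resulting group until the groups are pure or a stopping rule is reached.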
LOGISTIC REGRESSION
▪ It is a classification algorithm. It is used to estimate discrete values (binary
values like 0/1, yes/no, true/false) based on a given set of independent
variable(s). In simple words, it predicts the probability of occurrence of an
event by fitting data to a logistic function. Hence, it is also known as logit
regression. Since it predicts a probability, its output values lie between 0
and 1 (as expected).
▪ Accuracy Score: 83.76%
▪ Mean absolute error: 0.1623
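The logistic function mentioned above is what keeps the output between 0 and 1. A minimal sketch of the prediction step follows; the coefficients, the feature values, and the 0.5 decision threshold are assumptions for illustration, not fitted values from the project.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 decision."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


weights, bias = [2.0, -1.0], 0.0                 # hypothetical coefficients
p = predict_proba([1.0, 2.0], weights, bias)     # linear score z = 0
```

Training consists of choosing the weights and bias so that these probabilities match the observed outcomes as closely as possible.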
RANDOM FOREST
▪ Random Forest is a trademark term for an ensemble of decision trees. In
Random Forest, we have a collection of decision trees (hence the name “Forest”). To
classify a new object based on its attributes, each tree gives a classification and
we say the tree “votes” for that class. The forest chooses the classification
having the most votes (over all the trees in the forest).
▪ Accuracy score: 75.60%
▪ Mean absolute error: 0.2439
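The voting step described above can be sketched as follows. Each "tree" here is a hand-written single rule standing in for a trained decision tree; the feature names, thresholds, and applicant record are hypothetical.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Each tree votes for a class; the most common vote wins."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical single-rule "trees" over a dict of applicant features.
trees = [
    lambda s: "Y" if s["credit_history"] == 1 else "N",
    lambda s: "Y" if s["income"] >= 4000 else "N",
    lambda s: "Y" if s["loan_amount"] <= 150 else "N",
]
applicant = {"credit_history": 1, "income": 3000, "loan_amount": 120}
decision = forest_predict(trees, applicant)
```

In an actual random forest, each tree is trained on a random sample of the data and considers random subsets of features, which is what makes the votes differ.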
OUTPUT
VISUALIZATION
COMPARISON OF MODELS
CONCLUSION
▪ After trying and testing 5 different algorithms, the best accuracy is achieved
by Logistic Regression (83.76%), followed by Random Forest (75.60%),
Support Vector Machine (68.29%) and Decision Tree (65.85%), while
KNN performed the worst (63.41%). Through this project, I learned to apply machine
learning algorithms for prediction by using previous data for
training the model and testing it on real-time data. To make it
user-friendly, I have integrated a UI that gives the user a better interaction experience.
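The figures reported on the slides can be ranked and cross-checked with a short script. The check assumes the approval labels were encoded as 0 and 1, which the slides do not state; under that assumption the mean absolute error equals the misclassification rate, so accuracy and error should add up to 1.

```python
# Accuracy (%) and mean absolute error as reported on the slides.
reported = {
    "SVM": (68.29, 0.3170),
    "KNN": (63.41, 0.3658),
    "Decision Tree": (65.85, 0.3414),
    "Logistic Regression": (83.76, 0.1623),
    "Random Forest": (75.60, 0.2439),
}

# Rank models from best to worst accuracy.
ranking = sorted(reported, key=lambda name: reported[name][0], reverse=True)

# Assumption: labels are 0/1, so accuracy/100 + MAE should be 1,
# up to the rounding of the reported figures.
consistent = all(
    abs(acc / 100 + mae - 1.0) < 1e-3 for acc, mae in reported.values()
)
```

All five reported pairs pass this check, which suggests the two metrics carry the same information for this binary task.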
THANK YOU !!
Name of Student: Aniket A. Bhavsar
Email: bhavsaraniket110@gmail.com
Department of Computer Engineering
K. K. Wagh Institute of Engineering Education & Research, Nashik
