Unit IV Introduction to ML
Prof. Rahul Navale
Department of AI & AIML
GH Raisoni College of Engineering and
Management Pune
Agenda
• Introduction to Machine Learning
• Learning Types
• ML Life Cycle
• Dataset for ML
• Data Pre-processing
• Training versus Testing
• Cross-Validation
What is Machine Learning?
• Definition: In simple terms, ML is the study of computer algorithms that learn from data. With ML algorithms we can make decisions or predictions.
• Examples:
– Voice assistants
– Product recommendations
– Predictive analytics
– Image recognition
Types of Machine Learning Algorithms
• Supervised Learning:
– The model is trained on labeled data (inputs paired with known outputs).
– Linear Regression: used for predicting continuous outcomes. It models the relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data.
– Logistic Regression: used for binary classification tasks (e.g., predicting yes/no outcomes). It estimates class probabilities using a logistic function.
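As a rough illustration of the two supervised methods above, the sketch below fits a one-feature line by ordinary least squares and applies a logistic function to a linear score. The helper names (`fit_line`, `sigmoid`), the toy data, and the logistic weights are assumptions made up for this example; the weights are not fitted.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sigmoid(z):
    """Logistic function: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical labeled data: hours studied vs. exam score
hours = [1, 2, 3, 4, 5]
scores = [45, 50, 55, 60, 65]
intercept, slope = fit_line(hours, scores)
predicted_score = intercept + slope * 6  # continuous prediction for 6 hours

# Logistic regression passes a linear score through the logistic function.
# These weights are illustrative only, not learned from data.
w0, w1 = -4.0, 1.0
p_pass = sigmoid(w0 + w1 * 6)              # probability of the "yes" class
label = "yes" if p_pass >= 0.5 else "no"   # binary decision at threshold 0.5
```

Linear regression returns a number on a continuous scale, while logistic regression returns a probability that is then thresholded into a yes/no label.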
Types of Machine Learning Algorithms
• Unsupervised Learning:
– The model is trained on a dataset without labeled responses.
– Clustering: algorithms such as K-means, hierarchical clustering, and DBSCAN group a set of objects so that objects in the same group are more similar to each other than to those in other groups.
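To make the clustering idea concrete, here is a minimal K-means sketch on 2-D points. It assumes a simplified initialization (the first k points seed the centroids) and a fixed number of iterations; the function name `kmeans` and the sample points are invented for illustration.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centroids

# Hypothetical unlabeled data: two visually separate groups
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
```

No labels are supplied; the grouping comes only from distances between points.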
Types of Machine Learning Algorithms
• Semi-Supervised Learning:
– An ML approach that trains models on a combination of a small amount of labeled data and a large amount of unlabeled data.
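One common way to combine labeled and unlabeled data is self-training (pseudo-labeling), sketched below with a nearest-mean classifier on 1-D values. This is only one of several semi-supervised techniques, and the slides do not name a specific one; the helpers `class_means` and `predict` and the data are hypothetical.

```python
def class_means(labeled):
    """Mean feature value per class from (value, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(means, x):
    """Assign x to the class whose mean is closest."""
    return min(means, key=lambda y: abs(x - means[y]))

# Small labeled set and a larger unlabeled set (hypothetical 1-D data)
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [1.5, 2.0, 2.5, 7.5, 8.0, 8.5, 3.0, 7.0]

# Self-training: pseudo-label the unlabeled points with the current model,
# then refit on the labeled and pseudo-labeled data together.
means = class_means(labeled)
pseudo = [(x, predict(means, x)) for x in unlabeled]
means = class_means(labeled + pseudo)
```

Practical self-training usually keeps only high-confidence pseudo-labels and repeats the loop several times; this sketch does a single pass and keeps every pseudo-label for brevity.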
ML Life Cycle
ML Implementation General Steps
Dataset for ML
• “Data really powers everything that we do.”
— Jeff Weiner.
• https://towardsdatascience.com/top-sources-for-machine-learning-datasets-bb6d0dc3378b
Data Pre-Processing
• Processing data is at the heart of machine learning. Typical steps:
1. Importing the required libraries
2. Importing the data set
3. Handling missing data
4. Encoding categorical data
5. Splitting the data set into a training set and a test set
6. Feature scaling
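Steps 3, 4, and 6 above can be sketched with the Python standard library alone. The records, the mean-imputation strategy, the one-hot encoding, and the choice of standardization as the scaling method are all assumptions for illustration; other strategies are equally valid.

```python
import statistics

# Hypothetical raw records: (age, city); None marks a missing age
rows = [(25, "Pune"), (None, "Mumbai"), (35, "Pune"), (30, "Delhi")]

# Step 3: handle missing data by replacing None with the column mean
known_ages = [age for age, _ in rows if age is not None]
mean_age = statistics.mean(known_ages)
ages = [age if age is not None else mean_age for age, _ in rows]

# Step 4: one-hot encode the categorical column (one 0/1 column per city)
cities = sorted({city for _, city in rows})
one_hot = [[1 if city == c else 0 for c in cities] for _, city in rows]

# Step 6: feature scaling by standardization (zero mean, unit variance)
mu = statistics.mean(ages)
sigma = statistics.pstdev(ages)
scaled_ages = [(a - mu) / sigma for a in ages]
```

The split in step 5 is omitted here. In a real pipeline the imputation mean and the scaling parameters would normally be computed on the training set only and then reused on the test set, so that no information from the test data leaks into training.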
Training Vs. Testing Data
• In machine learning, datasets are typically split into two subsets: training data and testing data.
• The training data is used to fit the machine learning algorithm.
• The testing data is held back and used to evaluate the accuracy of the trained algorithm on examples it has not seen.
• A common choice is to use about 70% of the dataset as the training set and the remaining 30% as the test set.
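A minimal sketch of the 70/30 split, using only the standard library. The function name `train_test_split`, the default fraction, and the seed are choices made for this example; they mirror but are not taken from any particular library.

```python
import random

def train_test_split(data, test_fraction=0.3, seed=42):
    """Shuffle a copy of data and split it into (train, test) lists."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset of 10 sample ids
samples = list(range(10))
train, test = train_test_split(samples)
```

Shuffling before splitting avoids a biased split when the original rows are ordered, for example sorted by class or by date.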
Cross-Validation
• Cross-validation is a statistical technique used to assess the performance of a machine learning (ML) model.
• It involves training the model on one subset of the dataset and evaluating it on another, then repeating the process with different subsets.
• This ensures that, in every round, the model is tested on data it did not see during training in that round.
• The results of all rounds are averaged to give the cross-validation accuracy.
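The procedure above corresponds to k-fold cross-validation, sketched here without external libraries. The helper names, the toy threshold classifier, and the data are assumptions for illustration; the folds are taken in order without shuffling to keep the example short.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds take one extra sample
        stop = start + fold_size + (1 if fold < remainder else 0)
        test_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        yield train_idx, test_idx
        start = stop

def cross_val_accuracy(xs, ys, k, fit, predict):
    """Average test accuracy over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        correct = sum(predict(model, xs[i]) == ys[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k

# Hypothetical model: a threshold halfway between the two class means
def fit(xs, ys):
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2

def predict(threshold, x):
    return 1 if x > threshold else 0

xs = [1, 9, 2, 8, 3, 7, 1.5, 8.5, 2.5, 7.5]
ys = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
accuracy = cross_val_accuracy(xs, ys, k=5, fit=fit, predict=predict)
```

With real data the samples are usually shuffled (or stratified by class) before the folds are cut, so that every training fold contains examples of each class.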
Thank You…
