MACHINE
LEARNING
ML201
- Abigail
LEARNING OBJECTIVES
• Analyze various problem statements and accurately
categorize them as either classification or regression tasks
• Develop critical thinking skills by evaluating and
considering the implications of categorizing them correctly
• Apply their understanding of classification and regression
to real-world scenarios
Traditional Programming
Vs
Machine Learning
Traditional Program:
Input → Processing (based on rules) → Output
The logic is defined explicitly, and the outcome is
deterministic.
Machine Learning Program:
Input Data → Training → Model → Prediction
The model learns from training data and can adapt to
new, unseen data, making the output probabilistic.
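As a rough illustration of the contrast above, the sketch below puts a hand-written rule next to a threshold that is learned from labeled examples. The feature (number of exclamation marks), the rule cut-off of 3, and the toy data are all assumptions made up for this example; real spam filters use far richer features and models.

```python
# Traditional program: the rule is written by hand.
def rule_based_is_spam(exclamation_count: int) -> bool:
    # Hypothetical hand-picked rule: more than 3 exclamation marks means spam.
    return exclamation_count > 3


# Machine learning program: the threshold is learned from labeled examples.
def train_threshold(counts: list[int], labels: list[bool]) -> float:
    spam = [c for c, is_spam in zip(counts, labels) if is_spam]
    ham = [c for c, is_spam in zip(counts, labels) if not is_spam]
    # Place the decision boundary midway between the two class averages.
    return (sum(spam) / len(spam) + sum(ham) / len(ham)) / 2


def predict(threshold: float, exclamation_count: int) -> bool:
    return exclamation_count > threshold


# Toy training data (made up for illustration).
train_counts = [0, 1, 2, 6, 7, 8]
train_labels = [False, False, False, True, True, True]

learned_threshold = train_threshold(train_counts, train_labels)
print(learned_threshold)              # 4.0
print(predict(learned_threshold, 5))  # True
```

Changing the training data changes the learned threshold, whereas the rule-based function only changes when someone edits the code.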
Why is Machine Learning
Important?
1. Identify Use Cases
•Business Problem: Understand the specific problem
you are addressing (e.g., customer segmentation,
predictive maintenance).
1.1. Define the Problem
•Type of Problem: Determine whether it’s a classification,
regression, clustering, etc.
•Business Objectives: Understand what you want to achieve
and how success will be measured.
Define Project Requirements
•Objective: Clearly state what you want to
achieve (e.g., classification, regression,
clustering).
•Data Availability: Assess the type and
volume of data available.
•Performance Metrics: Decide how you will
measure success (e.g., accuracy, precision,
recall).
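The three metrics named above can be computed directly from true and predicted labels. This is a minimal sketch for a binary problem where 1 is the positive class; the function name and the sample labels are assumptions for illustration only.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up labels for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```

Accuracy counts all correct predictions, precision asks how many predicted positives were right, and recall asks how many actual positives were found.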
TYPES OF LEARNING
Supervised Learning is a type of machine learning where an algorithm
is trained on a labeled dataset. The goal of supervised learning is to
learn a mapping from inputs to outputs so that the model can
accurately predict the output for new, unseen data.
Key Characteristics:
•Labeled Data: The training data consists of input-output pairs, where
the output (or label) is known.
•Training Process: The model learns by adjusting its parameters to
minimize the difference between its predictions and the actual labels.
•Types of Problems: Supervised learning can be used for:
• Classification: Predicting discrete categories (e.g., spam
detection).
• Regression: Predicting continuous values (e.g., house prices).
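To make the "Training Process" point concrete, here is a small sketch of a regression model with a single parameter `w` that is adjusted step by step to shrink the gap between predictions and labels. Gradient descent on mean squared error is one common choice, used here as an assumption; the data, learning rate, and step count are made up for illustration.

```python
def fit_slope(xs, ys, learning_rate=0.01, steps=1000):
    """Fit y ≈ w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean((w*x - y)^2) with respect to w.
        grad = sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / n
        # Move w a small step in the direction that reduces the error.
        w -= learning_rate * grad
    return w


# Labeled data: inputs xs with known outputs ys (here y = 2 * x exactly).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = fit_slope(xs, ys)
print(round(w, 3))      # 2.0
print(round(w * 5, 3))  # prediction for an unseen input x = 5 -> 10.0
```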
Unsupervised Learning is a type of machine learning where the algorithm
is trained on a dataset that does not have labeled outputs. In this approach,
the model attempts to identify patterns, structures, or relationships in the
data without any guidance on what the output should be.
Key Characteristics:
•Unlabeled Data: The training data consists only of input features, with no
corresponding output labels.
•Pattern Discovery: The model seeks to learn the underlying structure or
distribution of the data, grouping similar data points together or identifying
anomalies.
•Types of Problems: Unsupervised learning can be used for:
• Clustering: Grouping similar data points (e.g., customer
segmentation).
• Dimensionality Reduction: Reducing the number of features while
retaining important information (e.g., PCA).
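Clustering can be sketched with a bare-bones k-means loop: assign each point to its nearest centroid, then move each centroid to the mean of its points. This is a simplified illustration, not a production implementation; the points and the starting centroids are assumptions chosen so the two groups are easy to see.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centroids, iterations=10):
    """Return (labels, centroids) after a fixed number of k-means iterations."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = mean_point(members)
    return labels, centroids


# Unlabeled toy data: no outputs are given, only input features.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

labels, centroids = k_means(points, [points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

No labels were supplied; the grouping comes only from how close the points are to each other.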
Activity 1
Problem Statements :
• Predicting whether an email is spam or not.
• Estimating the price of a used car based on its
features.
• Classifying customer reviews as positive, negative,
or neutral.
• Predicting the temperature for tomorrow.
• Identifying whether an image contains a cat or a
dog.
• Forecasting sales numbers for the next quarter.
• Determining if a loan application is approved or denied.
Problem Statement
You have the following dataset consisting of three points in a 2D space
(each point has two features: x and y):
1. Point A: (2, 3)
2. Point B: (4, 5)
3. Point C: (6, 7)
Task: Calculate the centroid of these three points.
Steps to Solve the Problem
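1. List the Points: Identify the coordinates of each point: A = (2, 3), B = (4, 5), C = (6, 7).
2. Calculate the Centroid: Average the x-coordinates and the y-coordinates separately: centroid = ((x_A + x_B + x_C) / 3, (y_A + y_B + y_C) / 3).
3. Compute the Final Result: ((2 + 4 + 6) / 3, (3 + 5 + 7) / 3) = (4, 5).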
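The steps above can be checked with a few lines of Python. The `centroid` helper is a hypothetical name introduced for this sketch; it averages each coordinate across the points.

```python
def centroid(points):
    """Average each coordinate across all points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


# The three points from the problem statement.
points = {"A": (2, 3), "B": (4, 5), "C": (6, 7)}

result = centroid(list(points.values()))
print(result)  # (4.0, 5.0)
```

The same averaging is the update step that k-means clustering applies to each group of points.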
SUMMARY
• Classification deals with categorical
outputs.
• Regression deals with continuous
outputs.


Editor's Notes

  • #12 Problem Statements and Answers:
    • Predicting whether an email is spam or not. Answer: Classification (It's a binary classification problem: spam vs. not spam.)
    • Estimating the price of a used car based on its features. Answer: Regression (The output is a continuous value: the price of the car.)
    • Classifying customer reviews as positive, negative, or neutral. Answer: Classification (This is a multiclass classification problem.)
    • Predicting the temperature for tomorrow. Answer: Regression (The output is a continuous value: the temperature.)
    • Identifying whether an image contains a cat or a dog. Answer: Classification (It's a binary classification problem: cat vs. dog.)
    • Forecasting sales numbers for the next quarter. Answer: Regression (The output is a continuous value: the sales number.)
    • Determining if a loan application is approved or denied. Answer: Classification (It's a binary classification problem: approved vs. denied.)