CSE 465
Machine Learning
Lecture-1
1
What is Machine Learning?
Machine learning is a branch of artificial intelligence
(AI) and computer science which focuses on the use
of data and algorithms to imitate the way that
humans learn, gradually improving its accuracy.
Artificial intelligence leverages computers and machines to mimic the
problem-solving and decision-making capabilities of the human mind.
2
3
How does Machine Learning work?
4
How does Machine Learning work?
•A machine learning system learns from previous data,
builds prediction models, and predicts the output for
new data whenever it receives it.
•Suppose we have a complex problem in which we
need to make predictions. Instead of hand-coding
the decision logic, we feed the data to generic
algorithms, which build the logic from the data
and predict the output.
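To make the contrast with hand-written rules concrete, here is a minimal sketch in Python. Rather than hard-coding a cut-off, a generic routine picks the threshold that best separates the labeled examples. The data (hours studied and pass/fail) and the names `fit_threshold` and `predict` are invented for illustration; they are not part of the lecture.

```python
def fit_threshold(values, labels):
    """Return the cut-off t that maximises accuracy of the rule `value >= t -> 1`."""
    best_t, best_correct = None, -1
    for t in sorted(set(values)):
        # Count how many examples the candidate rule classifies correctly.
        correct = sum((v >= t) == bool(y) for v, y in zip(values, labels))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def predict(threshold, value):
    """Apply the learned rule to a new value."""
    return 1 if value >= threshold else 0


# Hypothetical previous data: hours studied and whether the student passed (1) or not (0).
hours = [1, 2, 3, 4, 6, 7, 8, 9]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

threshold = fit_threshold(hours, passed)
print(threshold)                                      # 6
print(predict(threshold, 5), predict(threshold, 10))  # 0 1
```

The logic (the value 6) is never written by the programmer; it comes out of the data, which is the point made in the bullets above.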
5
Machine learning methods
Machine learning models fall into three primary
categories:
•Supervised machine learning
•Unsupervised machine learning
•Semi-supervised learning
6
7
Supervised machine learning
•Supervised machine learning is defined by its use of
labeled datasets to train algorithms to classify data
or predict outcomes accurately.
•Supervised learning helps organizations solve a
variety of real-world problems, such as filtering
spam into a separate folder from your inbox (a
classification problem), as well as regression problems.
•Some methods used in supervised learning include
neural networks, naïve Bayes, linear regression,
logistic regression, random forests, and support
vector machines (SVMs).
8
9
Supervised machine learning
Supervised learning has two main categories, each
with its own typical algorithms:
• Classification
  • Logistic Regression
  • Decision Tree Classifier
  • K Nearest Neighbor Classifier
  • Random Forest Classifier
  • Neural Networks
• Regression
  • Linear Regression
  • Decision Tree Regressor
  • K Nearest Neighbor Regressor
  • Random Forest Regressor
  • Neural Networks
10
Classification
• Classification algorithms are used to group data by
predicting a categorical label or output variable based
on the input data.
• Classification is used when output variables are
categorical (or discrete), meaning there are two or
more classes.
• One of the most common examples of classification
algorithms in use is the spam filter in your email inbox.
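As one illustration of a classifier, the sketch below implements a tiny k-nearest-neighbor classifier in plain Python and applies it to a toy spam example. The two features per email (number of links, number of capitalized words) and all data values are assumptions made up for this sketch; a real spam filter uses far richer features.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features per email: (number of links, number of capitalized words)
emails = [(0, 1), (1, 0), (1, 2), (7, 9), (8, 6), (9, 8)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(emails, labels, (6, 7)))  # spam
print(knn_predict(emails, labels, (0, 0)))  # ham
```

The output is a categorical label ("spam" or "ham"), which is what distinguishes classification from regression.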
11
The Importance of Classification
•Classification is the most straightforward way for a
computer program to model human intelligence.
•It is a fundamental way for computer intelligence to
understand the world: as true (1) or false (0).
12
Supervised Learning: Definition
•Given a collection of records (the training set)
• Each record contains a set of attributes; one of the attributes
is the class.
•Find a model for the class attribute as a function of
the values of the other attributes.
•Goal: previously unseen records should be
assigned a class as accurately as possible.
• A test set is used to determine the accuracy of the model.
Usually, the given data set is divided into training and test
sets, with the training set used to build the model and the
test set used to evaluate it.
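The train/test procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the records are made up, each has a single numeric attribute, the "model" is a simple 1-nearest-neighbor rule, and the 70/30 split ratio and helper names are choices of this sketch rather than anything prescribed by the lecture.

```python
import random


def train_test_split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split it into (training set, test set)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(training_set, value):
    """Return the class of the training record whose attribute is closest to `value`."""
    nearest = min(training_set, key=lambda record: abs(record[0] - value))
    return nearest[1]


def accuracy(training_set, test_set):
    """Fraction of test records whose predicted class matches the true class."""
    hits = sum(predict_1nn(training_set, value) == cls for value, cls in test_set)
    return hits / len(test_set)


# Hypothetical records: (attribute value, class)
records = [(v, "low") for v in range(1, 11)] + [(v, "high") for v in range(21, 31)]

training_set, test_set = train_test_split(records)
print(len(training_set), len(test_set))  # 14 6
print(accuracy(training_set, test_set))
```

The key point is that `accuracy` only looks at records the model did not see during training, so it estimates performance on previously unseen data.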
13
Illustrating Supervised Learning
14
An example of a learned model
15
An example of a learned model
16
Let’s choose income as the initial condition
17
Regression
•Regression algorithms are used to predict a real or
continuous value, where the algorithm detects a
relationship between two or more variables.
•A common example of a regression task might be
predicting a salary based on work experience.
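For the salary example, a minimal least-squares sketch in Python is shown below. The numbers are invented and deliberately lie on a perfect line so that the result is easy to check by hand; real salary data would be noisy, and the function name `fit_line` is an assumption of this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years of experience -> salary (in thousands)
years = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]  # exactly 30 + 5 * years

slope, intercept = fit_line(years, salary)
print(slope, intercept)       # 5.0 30.0
print(slope * 7 + intercept)  # 65.0 (predicted salary for 7 years of experience)
```

Unlike a classifier, the model returns a continuous value, here a salary estimate for an experience level that was not in the data.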
18
Unsupervised machine learning
• Unsupervised machine learning uses machine learning
algorithms to analyze and cluster unlabeled datasets.
• These algorithms discover hidden patterns,
relationships or data groupings without the need for
human intervention.
• This method’s ability to discover similarities and
differences in information makes it ideal for exploratory
data analysis, cross-selling strategies, customer
segmentation, dimensionality reduction, and image and
pattern recognition.
19
20
Unsupervised machine learning
Unsupervised learning has two main categories, each
shown with a representative algorithm:
•Clustering
  • K-Means Clustering algorithm
•Association
  • Apriori Algorithm
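As a rough sketch of the K-Means idea named above, the Python code below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The sample points are invented, and the starting centroids are passed in explicitly to keep the run reproducible; in practice they are usually chosen at random.

```python
import math


def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate assignment and centroid-update steps."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled 2-D points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]

centroids, clusters = k_means(points, centroids=[(1, 1), (2, 1)])
print(clusters[0])  # [(1, 1), (1, 2), (2, 1)]
print(clusters[1])  # [(8, 8), (8, 9), (9, 8)]
```

No class labels are supplied anywhere; the two groups emerge purely from the distances between points.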
21
•Clustering
Clustering is the process of grouping data points into
clusters based on their similarity. This technique is useful
for identifying patterns and relationships in data without
the need for labeled examples.
•Association
Association rule learning is a technique for discovering
relationships between items in a dataset. It identifies rules
stating that the presence of one item implies the presence
of another item with a specific probability.
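The "specific probability" attached to an association rule is usually expressed through two measures, support and confidence. The sketch below computes both for a made-up set of shopping baskets. It covers only the rule-scoring step; it is not an implementation of the Apriori algorithm, which additionally searches for frequent itemsets.

```python
def count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions)


def support(transactions, items):
    """Fraction of all transactions that contain `items`."""
    return count(transactions, items) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`."""
    both = set(antecedent) | set(consequent)
    return count(transactions, both) / count(transactions, antecedent)


# Hypothetical market baskets
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

print(support(baskets, {"bread", "butter"}))       # 0.6
print(confidence(baskets, {"bread"}, {"butter"}))  # 0.75
```

Here the rule "bread implies butter" holds in 3 of the 4 baskets that contain bread, so its confidence is 0.75.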
22
23
An example application
• An emergency room in a hospital measures 17 variables
(e.g., blood pressure, age, etc.) of newly admitted
patients.
• A decision is needed: whether to put a new patient in
an intensive-care unit.
• Due to the high cost of the ICU, those patients who may
survive less than a month are given higher priority.
• Problem: to predict high-risk patients and distinguish
them from low-risk patients.
24
Another application
• A credit card company receives thousands of applications
for new cards. Each application contains information
about the applicant:
  • age
  • marital status
  • annual salary
  • outstanding debts
  • credit rating
  • etc.
• Problem: to decide whether an application should be
approved, or to classify applications into two categories,
approved and not approved.
25
Machine learning and our focus
• Machine learning resembles human learning from past experiences.
• A computer does not have “experiences”.
• A computer system learns from data, which represent
some “past experiences” of an application domain.
• Our focus: learn a target function that can be used to
predict the values of a discrete class attribute, e.g.,
approved or not approved, and high-risk or low-risk.
• The task is commonly called supervised learning,
classification, or inductive learning.
26
The data and the goal
•Data: A set of data records (also called examples,
instances or cases) described by
  • k attributes: A1, A2, …, Ak.
  • a class: each example is labelled with a pre-defined class.
•Goal: To learn a classification model from the data
that can be used to predict the classes of new (future,
or test) cases/instances.
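To show what "records with k attributes and a class" can look like in code, the sketch below stores each record as a dictionary and learns a very simple classification model with the 1R (one-rule) method: pick the single attribute whose value-to-majority-class rule makes the fewest training errors. The attribute names, the six records, and the choice of 1R are all assumptions for illustration; this is not the loan table from the slides.

```python
from collections import Counter, defaultdict


def one_r(records, attributes, class_key="class"):
    """1R: return (attribute, rule, training errors) for the best single-attribute rule."""
    best = None
    for attr in attributes:
        # Count class frequencies for every value of this attribute.
        counts = defaultdict(Counter)
        for r in records:
            counts[r[attr]][r[class_key]] += 1
        # Rule: each attribute value predicts its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[r[attr]] != r[class_key] for r in records)
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical training records with k = 2 attributes and a class.
records = [
    {"employed": "yes", "owns_home": "no", "class": "Yes"},
    {"employed": "yes", "owns_home": "yes", "class": "Yes"},
    {"employed": "no", "owns_home": "yes", "class": "Yes"},
    {"employed": "no", "owns_home": "no", "class": "No"},
    {"employed": "no", "owns_home": "no", "class": "No"},
    {"employed": "yes", "owns_home": "no", "class": "Yes"},
]

attribute, rule, errors = one_r(records, ["employed", "owns_home"])
print(attribute, errors)  # employed 1

new_case = {"employed": "no", "owns_home": "no"}
print(rule[new_case[attribute]])  # No
```

The learned `rule` is the classification model; applying it to `new_case` is the prediction step for a future instance.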
27
An example: data (loan application), approved or not
28
An example: the learning task
• Learn a classification model from the data
• Use the model to classify future loan applications into
• Yes (approved) and
• No (not approved)
• What is the class of the following case/instance?
29
Supervised vs. unsupervised learning
• Supervised learning: classification is seen as supervised
learning from examples.
  • Supervision: the data (observations, measurements, etc.)
  are labeled with pre-defined classes, as if a “teacher”
  had provided the classes (supervision).
  • Test data are classified into these classes too.
• Unsupervised learning (clustering)
  • Class labels of the data are unknown.
  • Given a set of data, the task is to establish the existence of
  classes or clusters in the data.
30
Supervised learning process: two steps
■ Learning (training): Learn a model using the
training data
■ Testing: Test the model using unseen test
data to assess the model accuracy
Semi-supervised learning
• Semi-supervised learning is a type of machine learning that
falls in between supervised and unsupervised learning.
• This method uses a small amount of labeled data and a large
amount of unlabeled data to train a model.
• The goal of semi-supervised learning is to learn a function
that can accurately predict the output variable based on the
input variables, similar to supervised learning. However,
unlike supervised learning, the algorithm is trained on a
dataset that contains both labeled and unlabeled data.
• Semi-supervised learning is particularly useful when there is
a large amount of unlabeled data available, but it’s too
expensive or difficult to label all of it.
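One common way to use unlabeled data is self-training: fit a model on the few labeled points, give the most confident unlabeled point a pseudo-label, and repeat. The sketch below does this with a nearest-centroid model on one-dimensional data. Self-training is only one of several semi-supervised techniques, and the data, the confidence measure (distance to the closest class centroid), and the function names are assumptions of this sketch.

```python
def centroids(labeled):
    """Mean attribute value per class."""
    sums = {}
    for value, cls in labeled:
        total, n = sums.get(cls, (0.0, 0))
        sums[cls] = (total + value, n + 1)
    return {cls: total / n for cls, (total, n) in sums.items()}


def self_train(labeled, unlabeled):
    """Repeatedly pseudo-label the unlabeled point that lies closest to a class centroid."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    while unlabeled:
        means = centroids(labeled)
        # Most confident (value, class) pair = smallest distance to that class's centroid.
        value, cls = min(
            ((v, c) for v in unlabeled for c in means),
            key=lambda pair: abs(pair[0] - means[pair[1]]),
        )
        labeled.append((value, cls))
        unlabeled.remove(value)
    return centroids(labeled)


def predict(model, value):
    """Assign the class whose centroid is nearest to `value`."""
    return min(model, key=lambda cls: abs(value - model[cls]))


# Hypothetical data: two labeled points and six unlabeled ones.
labeled = [(1.0, "A"), (10.0, "B")]
unlabeled = [2.0, 3.0, 4.0, 7.0, 8.0, 9.0]

model = self_train(labeled, unlabeled)
print(model)                # {'A': 2.5, 'B': 8.5}
print(predict(model, 6.0))  # B
```

With only the two labeled points the class centroids would sit at 1.0 and 10.0; the pseudo-labeled points pull them to 2.5 and 8.5, which shifts where new values are classified.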
31
Semi-supervised learning
32
Reinforcement learning
• Reinforcement learning is a machine learning approach that
resembles supervised learning, except that the algorithm is not
trained on labeled sample data. Instead, it learns as it goes,
by trial and error.
• An agent learns to make decisions by interacting with its
environment. The agent is rewarded or penalized (with
points) for the actions it takes, and its goal is to maximize
the total reward.
• Unlike supervised and unsupervised learning, reinforcement
learning is particularly suited to problems where the data is
sequential, and the decision made at each step can affect
future outcomes.
• Common examples of reinforcement learning include game
playing, robotics, resource management, and many more.
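The agent, reward, and trial-and-error loop described above can be sketched with tabular Q-learning on a toy problem. Everything specific here is an assumption of the sketch: the environment is a five-cell corridor with a reward of 1 at the right end, and the learning rate, discount factor, exploration rate, and episode count are arbitrary illustrative values. Q-learning is one reinforcement learning algorithm among many.

```python
import random

N_STATES = 5          # cells 0..4; reaching cell 4 ends the episode
GOAL = N_STATES - 1
ACTIONS = ("left", "right")


def step(state, action):
    """Environment: move one cell; the reward is 1 only when the goal is reached."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q(state, action) by interacting with the environment."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Trial and error: sometimes explore at random, otherwise act greedily
            # (ties between equally good actions are broken at random).
            best_value = max(q[(state, a)] for a in ACTIONS)
            greedy = [a for a in ACTIONS if q[(state, a)] == best_value]
            action = rng.choice(ACTIONS) if rng.random() < epsilon else rng.choice(greedy)
            next_state, reward = step(state, action)
            # Q-learning update: move the estimate toward reward + discounted future value.
            future = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q


q = train()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)
```

After training, the greedy policy should choose "right" in every non-terminal cell, since that is the shortest route to the reward. Note that the agent was never told this; it only received rewards for the actions it tried.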
33
Reinforcement learning
34
Applications of Machine Learning
•Recommendation systems
Companies like Netflix and Amazon use machine learning to
analyze your past behavior and recommend products or
movies you might like.
•Voice assistants
Voice assistants like Siri, Alexa, and Google Assistant use
machine learning to understand your voice commands and
provide relevant responses. They continually learn from your
interactions to improve their performance.
35
Applications of Machine Learning
•Fraud detection
Banks and credit card companies use machine learning to
detect fraudulent transactions. By analyzing patterns of
normal and abnormal behavior, they can flag suspicious
activity in real time.
•Social media
Social media platforms use machine learning for a variety of
tasks, from personalizing your feed to filtering out
inappropriate content.
36
Impact of Machine Learning
• Healthcare
In healthcare, machine learning is used to predict disease
outbreaks, personalize patient treatment plans, and
improve medical imaging accuracy. For instance, Google's
DeepMind Health is working with doctors to build
machine learning models to detect diseases earlier and
improve patient care.
• Finance
The finance sector has also greatly benefited from
machine learning. It's used for credit scoring, algorithmic
trading, and fraud detection. A recent survey found that
56% of global executives said that artificial intelligence
(AI) and machine learning have been implemented in
financial crime compliance programs.
37
Impact of Machine Learning
•Transportation
Machine learning is at the heart of the self-driving
car revolution. Companies like Tesla and Waymo use
machine learning algorithms to interpret sensor data
in real time, allowing their vehicles to recognize
objects, make decisions, and navigate roads
autonomously.
38
Task
•A dataset is given on slide 27, and a new case is given
on slide 28. Which algorithm would you apply to
predict the result? Give your reasons for choosing
that algorithm. After applying the algorithm, what is
the class of the case on slide 28?
39
