INTRODUCTION TO MACHINE LEARNING
CHAPTER 1
Topics Covered
1.1 Introduction to Machine Learning
 Artificial Intelligence
 Machine Learning
 Application of Machine Learning
1.2 Types of Machine Learning
1.3 Supervised Machine Learning
1.3.1 Classification
1.4 Unsupervised Machine Learning and its Application
1.4.1 Difference between Supervised and Unsupervised Machine
Learning
1.5 Semi-Supervised Machine Learning
1.6 Reinforcement Machine Learning and its Application
1.7 Hypothesis Space and Inductive Bias
1.8 Underfitting and Overfitting
1.9 Evaluation and Sampling Methods
 1.9.1 Regression Metrics
 1.9.2 Classification Metrics
1.10 Training and Test Dataset and Need of
Cross Validation
1.11 Linear Regression
 1.11.1 Linear Models
1.12 Decision Trees
 1.12.1 The Decision Tree Learning Algorithm
 1.12.2 Entropy
 1.12.3 Information Gain
 1.12.4 Impurity Measures
 Exercise
Introduction to Machine Learning
 Machine learning is a branch of artificial intelligence (AI) and computer science which focuses on the use of
data and algorithms to imitate the way that humans learn, gradually improving its accuracy.
 Machine Learning is an umbrella term used to describe a variety of different tools and techniques which
allow a machine or a computer program to learn and improve over time.
 ML tools and techniques include but are not limited to Statistical Reasoning, Data Mining, Mathematics and
Programming.
 To learn, a machine needs Data, Processing Power/Performance and Time. It could be said that if a machine
gets better at something over time and improves its performance as more data is acquired, then this
machine is said to be learning and we could call this process Machine Learning.
Introduction to Machine Learning
 Machine learning gives machines/computers the ability to learn the way humans do, i.e. without explicitly telling them what to do.
 "Machine learning gives computers the ability to learn without being explicitly programmed." - Arthur Samuel
 Machine learning refers to teaching devices to learn from the information given in a dataset without manual human interference.
Well Posed Learning Problem
A well-posed learning problem is a task in which the input, output, and learning objective are clearly defined, and there exists a
unique solution to the problem.
A well-posed learning problem has three properties:
1. Existence: The problem must have at least one solution; there must be a possible relationship between the input and output data.
2. Uniqueness: The problem must have a unique solution; there must be only one correct relationship between the input and output data.
3. Stability: The solution to the problem must be stable with respect to small changes in the input data. The output produced by the
machine learning algorithm should not change significantly when the input data is slightly modified.
A well-posed learning problem is essential for the development of effective and reliable machine learning algorithms. Without a
well-posed problem, the algorithm may produce incorrect or unstable results, making it difficult to use in practical applications.
So it is important to carefully define the input, output, and learning objective when formulating a machine learning
problem.
Well Posed Learning Problem
A learning problem can be defined as a task in which an agent (such as a machine learning
algorithm or a human) must learn to perform a specific task or make predictions based on a set of
inputs or data.
Three features that can be identified in a learning problem are:
Input data: This refers to the set of data or information that the agent uses to learn and make
predictions. The input data can be structured or unstructured, and may come from a variety of sources
such as text, images, audio, or sensor data.
Output or prediction: This refers to the task that the agent is trying to learn or the prediction that it is
trying to make based on the input data. The output can be a single value, a set of values, or a
probability distribution over possible outcomes.
Well Posed Learning Problem
Evaluation metric / Performance measure: This refers to the measure or metric that is used to evaluate
the performance of the agent on the learning task.
The evaluation metric may vary depending on the specific learning problem and may include metrics such as
Accuracy, Precision, Recall, F1 Score, or Mean Squared Error.
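The short sketch below shows how such metrics are typically computed in practice. It assumes scikit-learn is available; the label and prediction arrays are made-up illustrative values, not results from a real model.

# Minimal metrics sketch, assuming scikit-learn; the arrays are illustrative only.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, mean_squared_error)

# Classification: hypothetical true labels vs. model predictions
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))

# Regression: hypothetical true targets vs. predicted values
y_true_reg = [3.0, 2.5, 4.1]
y_pred_reg = [2.8, 2.9, 4.0]
print("MSE      :", mean_squared_error(y_true_reg, y_pred_reg))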
Definition:-
A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Tom Mitchell
Examples of well-posed learning problems:
1. Image classification: Given a set of labeled images,
Task:- Is to learn a model that can correctly classify new images into their respective classes.
Input:- Is the image data.
Output:- Is the class label.
Learning objective:- Is to minimize the classification error.
Performance Measure :- Percentage of images correctly classified.
Training Experience :- A database of images with given classifications.
2. Sentiment analysis: Given a set of text documents,
Task:- Is to learn a model that can predict the sentiment of new documents (e.g., positive, negative, or neutral).
Input:- Is the text data.
Output:- Is the sentiment label.
Learning objective:- Is to minimize the prediction error.
Performance Measure :- Percentage of new documents whose sentiment is correctly predicted.
Training Experience :- A database of documents with given sentiments.
Examples of well-posed learning problems:
3. Fraud detection: Given a set of transaction data,
Task:- Is to learn a model that can identify fraudulent transactions.
Input:- Is the transaction data.
Output:- Is a binary label (fraudulent or not).
Learning objective:- Is to minimize the false positive and false negative rates.
Performance Measure :- The false positive and false negative rates.
4. Regression: Given a set of input features and corresponding target values,
Task:- Is to learn a model that can predict the target value for new input data.
Input:- Is the feature data.
Output:- Is the target value.
Learning objective:- Is to minimize the prediction error (e.g., mean squared error).
Performance Measure :- The prediction error (e.g., mean squared error) on new data.
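As a concrete illustration of example 4, the sketch below fits a linear model to synthetic data and measures its mean squared error on held-out data. It assumes scikit-learn and NumPy; the data is generated purely for illustration.

# Regression sketch (example 4), assuming scikit-learn; the data is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))              # input feature data
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1, 100)    # target values with noise

# Training experience = the training split; performance is measured on held-out data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_train, y_train)
print("Test MSE:", mean_squared_error(y_test, model.predict(X_test)))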
History of Machine Learning
 Year 1950 : Alan Turing proposed the Turing Test.
 Year 1957 : The Perceptron, developed by Frank Rosenblatt - one of the first neural network models.
 Year 1966 : ELIZA, a Natural Language Processing program developed at MIT to act as a therapist.
 Year 1967 : The advent of the Nearest Neighbor algorithm, prominently used in search and approximation.
 Year 1970 : Backpropagation takes shape. Backpropagation is a family of algorithms used extensively in Deep Learning.
 Year 1980 : Kunihiko Fukushima built a multilayered neural network called the Neocognitron.
 Year 1981 : Explanation-Based Learning.
 Year 1989 : Reinforcement Learning is realized with the Q-Learning algorithm.
 Year 2009 : ImageNet.
 Year 2011 : Google Brain (followed by Facebook's DeepFace in 2014).
 Year 2022 : ChatGPT (Chat Generative Pre-trained Transformer).
https://www.zeolearn.com/magazine/what-is-machine-learning
Artificial Intelligence vs. Machine Learning vs. Deep Learning vs. Neural
Networks
 Machine learning, Deep learning, and Neural networks are all sub-fields of Artificial Intelligence.
 Neural networks are a sub-field of Machine learning, and Deep learning is a sub-field of Neural networks.
 "Deep" machine learning can use labeled datasets (supervised learning). It eliminates some of the
human intervention required and enables the use of larger datasets.
 "Non-deep" machine learning is more dependent on human intervention to learn: human experts determine
the set of features used to understand the differences between data inputs, requiring more structured data to learn.
 Neural networks, or artificial neural networks (ANNs), are composed of node layers, containing an input layer,
one or more hidden layers, and an output layer. Each node, or artificial neuron, connects to others and has an
associated weight and threshold (see the sketch after this list).
 Deep learning and Neural Networks are accelerating progress in areas such as computer vision, natural language
processing, and speech recognition.
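A toy sketch of a single artificial neuron follows: the node computes a weighted sum of its inputs and fires only when that sum reaches its threshold. The inputs, weights, and threshold below are illustrative values; NumPy is assumed.

# Toy sketch of one artificial neuron: weighted sum of inputs compared to a threshold.
import numpy as np

def neuron(inputs, weights, threshold):
    activation = np.dot(inputs, weights)         # weighted sum of the inputs
    return 1 if activation >= threshold else 0   # fire only at or above the threshold

x = np.array([0.5, 0.8, 0.2])       # inputs from the previous layer (illustrative)
w = np.array([0.4, 0.7, -0.3])      # one weight per connection (illustrative)
print(neuron(x, w, threshold=0.5))  # -> 1, since 0.5*0.4 + 0.8*0.7 - 0.2*0.3 = 0.70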
Artificial Intelligence vs. Machine Learning vs. Deep Learning vs.
Neural Networks
 AI refers to the software and processes that are designed to mimic the way humans think and process
information. It includes computer vision, natural language processing, robotics, autonomous vehicle operating
systems, and machine learning.
 With the help of artificial intelligence, devices are able to learn and identify information in order to solve
problems and offer key insights into various domains.
Artificial Intelligence vs. Machine Learning vs. Deep
Learning vs. Neural Networks
AI enables machines to understand data and make decisions based on patterns hidden
in data without any human intervention.
 Machines adjust their knowledge based on new inputs.
 Examples: self-driving cars; Alexa and Cortana, which hold conversations with us in natural
human language.
 Machine Learning:- Subset of AI.
 With the help of algorithms, machine learning can process a surplus of
information and output an accurate prediction within moments; many such systems use deep learning.
 Uses statistical models to explore, analyze and find patterns in large amounts of
data.
 Performs tasks without being explicitly programmed, which allows systems to learn from
experience and improve over time without human intervention.
https://learnerjoy.com/artificial-intelligence-vs-machine-learning-vs-deep-learning-vs-data-science/
Artificial Intelligence vs. Machine Learning vs. Deep
Learning vs. Neural Networks
 Approaches:- 1. Supervised learning, 2. Unsupervised learning and 3.
Reinforcement learning.
1. Supervised learning:- Requires a human to input labeled data (past
labeled data) into the machine, and outputs a prediction for a new sample.
 2. Unsupervised learning:- Takes unlabeled data as input, groups the
data based on its similarity, and outputs clusters of similar samples for the
human to analyze further. The output is not known in advance. Algorithms: K-means,
Hierarchical Clustering, PCA, Neural Networks (see the clustering sketch after this list).
3. Reinforcement learning:- The machine learns over time through a reward or
trial-and-error system that distinguishes good actions from bad actions. (It is
sometimes confused with semi-supervised learning, which combines a small amount of
labeled data with a large amount of unlabeled data; the two are distinct approaches.)
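The sketch below illustrates the unsupervised approach: K-means grouping unlabeled points into clusters for a human to interpret. It assumes scikit-learn; the blob data is synthetic and purely illustrative.

# Unsupervised-learning sketch, assuming scikit-learn: K-means on synthetic blobs.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)  # unlabeled input data
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print("Cluster assignments:", kmeans.labels_[:10])   # cluster index per sample
print("Cluster centers:\n", kmeans.cluster_centers_)  # one center per cluster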
Artificial Intelligence vs. Machine Learning vs. Deep
Learning vs. Neural Networks
 Deep Learning - Deep learning is a subset of machine learning.
 The main idea behind deep learning is for machines to learn the way the human
brain does.
 The human brain is made of multitudes of neurons that allow us to operate the way
we do.
 Inspired by the collections of connected neurons in the human brain, scientists create multi-
layer networks that machines can use to learn from experience and make predictions.
Techniques
 Artificial Neural Networks (ANN):- Input in the form of numbers
 Convolutional Neural Networks (CNN):- Input in the form of images
 Recurrent Neural Networks (RNN):- Input in the form of time-series data
Two popular frameworks used in Deep learning are
•PyTorch by Facebook
•TensorFlow by Google
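A minimal deep-learning sketch follows, using TensorFlow/Keras (one of the frameworks named above). The random inputs and toy labels are illustrative, not a real dataset; a PyTorch version would look very similar.

# Minimal multi-layer network sketch, assuming TensorFlow/Keras; data is synthetic.
import numpy as np
import tensorflow as tf

X = np.random.rand(200, 4).astype("float32")   # numeric inputs (ANN-style)
y = (X.sum(axis=1) > 2.0).astype("int32")      # toy binary labels

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),     # hidden layer
    tf.keras.layers.Dense(1, activation="sigmoid"),   # output layer
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)            # learn from experience
print("Training accuracy:", model.evaluate(X, y, verbose=0)[1])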
Artificial Intelligence vs. Machine Learning vs. Deep
Learning vs. Neural Networks
 Data Science
Data science performs exploratory analysis to better understand
the data.
It plays a huge role when building ML models: if you have a large
amount of data, you can draw more insights from it and obtain accurate
results that can be applied to business use cases.
 Statistical tools: Linear algebra.
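As a small illustration of exploratory analysis, the sketch below summarizes and correlates a hand-written table that stands in for a real business dataset; pandas is assumed.

# Exploratory-analysis sketch, assuming pandas; the tiny table is illustrative only.
import pandas as pd

df = pd.DataFrame({
    "age":    [25, 32, 47, 51, 38],
    "income": [35000, 52000, 78000, 81000, 61000],
    "spend":  [1200, 1800, 2600, 2500, 2100],
})
print(df.describe())   # summary statistics for each column
print(df.corr())       # pairwise correlations hint at patterns in the data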