INTRODUCTION TO
MACHINE
LEARNING
By
Archana M
Department of computer science
Introduction
• Machine Learning (ML) is one of the most
successful applications of Artificial Intelligence.
It lets systems learn automatically from data
without being explicitly programmed.
• It has gained considerable prominence recently because
it can be applied across many industries
to tackle complex problems quickly and effectively.
• From digital assistants that play your music to
products recommended based on prior searches,
Machine Learning now touches many aspects of
our lives.
• It is a skill in high demand because companies need
software that can learn from data and produce accurate
results. The core objective is to learn a function that
maps inputs to outputs with as little error as possible.
WHAT IS MACHINE
LEARNING?
• Machine Learning is a branch of
Artificial Intelligence (AI) that improves the quality
of applications by learning from previously collected data.
Systems learn from the data instead of
needing new code for every new but similar
task.
• The aim is for the process to be automated rather than
continuously modified by hand. The program improves
itself through experience and past data.
WHY MACHINE LEARNING?
• Machine Learning is a continuously
evolving field with high demand. With little human
intervention, it can deliver results in near real time
using existing, processed data.
• It helps analyze and assess large amounts of
data by developing data-driven models. Today,
Machine Learning has become a fast and
efficient way for firms to build models and plan
strategy.
ADVANTAGES OF MACHINE
LEARNING
• Highly automated (little human intervention once deployed)
• Analyses large amounts of data
• More efficient than traditional data analysis methods
• Identifies trends and patterns with ease
• Reliable and efficient
• Requires less manual effort
• Handles a variety of data
• Accommodates most kinds of applications
COMMONLY USED ALGORITHMS IN MACHINE LEARNING
There are many different models in Machine Learning. Here are the most
commonly used algorithms today:
• Gradient Boosting Algorithms
• Dimensionality Reduction Algorithms
• Random Forest
• K-Means
• KNN
• Naive Bayes
• SVM
• Decision Tree
• Logistic Regression
• Linear Regression
• Machine Learning (ML) is steadily having
a huge impact on data-driven business decisions
across the globe. It gives organizations
the information they need to make more informed, data-driven
choices more quickly than conventional
methodologies allow.
• Yet, there are many issues in Machine Learning that
cannot be overlooked in spite of its high
productivity.
Machine Learning : A General
Perspective
• The goal in machine learning is to recognize
patterns in a dataset in a general way. After you
recognize the patterns, you can use this information
to model the data, to interpret the data, or to predict
the outcome for new data that has not been seen
before.
• Machine learning is a subfield of artificial intelligence and
machine learning algorithms are used in other related
fields like natural language processing and computer
vision.
• In general, there are three types of learning and these are
• supervised learning,
• unsupervised learning, and
• reinforcement learning.
Their names convey the main idea behind each of them.
Supervised learning
In supervised learning, your system learns under the
supervision of known outputs, so supervised algorithms are
preferred if your dataset contains output information (labels).
Here is an example.
• Let's assume you run a medical statistics company and
you have a dataset which contains patients' features such as
blood pressure, blood sugar level, heart rate per
minute, etc.
• You also have information about whether each patient has
experienced heart disease or not.
• By training a machine learning algorithm, your
system can find a pattern between the features and the
probability of experiencing heart disease. Your
algorithm can then predict whether a new patient is
at risk of heart disease, so that a doctor can take
precautions and save the person's life.
A decision tree from one of the projects. The x's are features (medical
tests in this case), and the 0/1 values in the boxes represent the absence or presence of heart disease.
As you can see, the algorithm produced an interpretable tree.
Unsupervised Learning
• Unsupervised algorithms are used if your data doesn't contain
outputs and you would like to discover the clusters in the
dataset.
• Handwritten digit recognition can be treated as an unsupervised
problem when the digit images come without labels.
• In this application you know that there should be 10
clusters {0,1,2,3,4,5,6,7,8,9}, but the problem with
handwritten digits is that there are countless ways to write
a digit by hand, and everyone writes digits differently.
• How does a computer understand what is written by hand?
• Here, you can use an unsupervised algorithm like K-means
or the EM algorithm.
• With these algorithms, you start with random initial
cluster means, and these mean points iteratively
converge toward the real cluster means. After you complete the
training, if you visualize the means of the clusters you can see
that they really look like digits. You then label these clusters
with the corresponding digits, and when the computer encounters a
new handwritten digit, the algorithm labels it with the mean
that is closest to it.
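The procedure above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the two-dimensional toy points stand in for digit images, and the function names `kmeans` and `nearest` are assumptions made for this example.

```python
import random

def nearest(point, means):
    """Index of the mean with the smallest squared Euclidean distance."""
    return min(range(len(means)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(point, means[i])))

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    means = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        # Assignment step: attach every point to its closest mean.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, means)].append(p)
        # Update step: move each mean to the centroid of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old mean if a cluster went empty
                means[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return means

# Hypothetical toy data: two well-separated blobs instead of real digit images.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
means = kmeans(data, k=2)

# A new point receives the label of the closest learned mean.
new_point = (4.8, 5.2)
label = nearest(new_point, means)
```

With real digit images each point would be a vector of pixel values and k would be 10; the cluster indices still have to be mapped to digits by inspecting the learned means.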
Reinforcement learning
Let's assume you want to create an intelligent agent
which plays chess.
In chess, you cannot evaluate moves one at a time in isolation.
Your agent should consider a series of moves and then
decide to take the action which would maximize the utility.
Therefore your agent should play many games against
itself to learn the best action to take. We call this type of
learning reinforcement learning, and it is commonly used in
games.
• Poor Quality of Data
• Underfitting of Training Data
• Overfitting of Training Data
• Machine Learning is a Complex Process
• Lack of Training Data
• Slow Implementation
• Imperfections in the Algorithm When Data Grows
1. Poor Quality of Data
• Data plays a significant role in the machine learning
process. One of the significant issues that machine
learning professionals face is the absence of good quality
data. Unclean and noisy data can make the whole process
extremely exhausting. We don’t want our algorithm to
make inaccurate or faulty predictions.
• Hence the quality of data is essential to enhance the
output. Therefore, we need to ensure that the process of
data preprocessing which includes removing outliers,
filtering missing values, and removing unwanted features,
is done with the utmost level of perfection.
2. Underfitting of Training Data
Underfitting occurs when the model is unable to establish an accurate
relationship between input and output variables. It is like
trying to fit into undersized jeans. It signifies that the model is too simple
to capture the pattern in the data. To overcome this issue:
• Increase the training time of the model
• Increase the complexity of the model
• Add more features to the data
• Reduce regularization
3. Overfitting of Training Data
• Overfitting refers to a machine learning model
that fits its training data too closely, including the noise
in it, which negatively affects its performance on new data.
It is like trying to fit into oversized jeans. Unfortunately,
this is one of the significant issues faced by machine learning
professionals. It often arises when the algorithm is
trained with noisy or biased data, which affects
its overall performance.
• Let's understand the effect of biased data with an example.
Consider a model trained to differentiate
between a cat, a rabbit, a dog, and a tiger. The
training data contains 1000 cats, 1000 dogs, 1000
tigers, and 4000 rabbits. There is then a considerable
probability that it will identify a cat as a rabbit. In
this example, we had a vast amount of data, but it
was biased toward rabbits; hence the prediction was negatively
affected.
4. Machine Learning is a
Complex Process
• The machine learning industry is young and is
continuously changing. Rapid trial-and-error experiments are
being carried out. The process keeps changing, and hence
there is a high chance of error, which makes the learning
complex.
• It includes analyzing the data, removing data bias, training
models, applying complex mathematical calculations, and a
lot more. Hence it is a really complicated process, which is
another big challenge for machine learning professionals.
5. Lack of Training Data
• The most important task in the
machine learning process is to train the model on data to
achieve an accurate output. Too little training data
will produce inaccurate or heavily biased predictions.
• Let us understand this with the help of an example.
• Think of training a machine learning algorithm as similar to teaching a
child. One day you decide to explain to a child how to
distinguish between an apple and a watermelon. You take
an apple and a watermelon and show him the difference
between them based on their color, shape, and taste.
• In this way, he will soon become very good at differentiating
between the two.
• A machine learning algorithm, on the other hand, needs a lot
of data to make the same distinction. For complex problems, it may even
require millions of training examples. Therefore we need to
ensure that machine learning algorithms are trained with
sufficient amounts of data.
6. Slow Implementation
• This is one of the common issues faced by machine
learning professionals. Machine learning models
can provide accurate results, but producing them
can take a tremendous amount of time.
• Slow programs, data overload, and excessive
requirements all add to the time needed to obtain
accurate results. Further, a model requires constant
monitoring and maintenance to deliver the best
output.
7. Imperfections in the Algorithm
When Data Grows
• The model may become useless in the future as data
grows. The best model of the present may become
inaccurate in the future and need to be
retrained or redesigned. So you need regular monitoring and
maintenance to keep the algorithm working. This is
one of the most exhausting issues faced by machine
learning professionals.
Conclusion:
• Machine Learning is one of the most rapidly growing technologies,
used in medical diagnosis, speech recognition,
robotic training, product recommendations, video
surveillance, and the list goes on.
Design a Learning System in
Machine Learning
According to Arthur Samuel, “Machine
Learning enables a machine to automatically learn
from data, improve performance from experience,
and predict things without being explicitly programmed.”
• In simple words, when we feed training data to a
machine learning algorithm, the algorithm
produces a mathematical model. With the help of
this model, the machine makes
predictions and takes decisions without being
explicitly programmed.
• Also, the more training data the machine
works with, the more experience it gains and the
more accurate its results become.
Example :
• In a driverless car, the training data fed to the
algorithm covers how to drive on highways and in busy
and narrow streets, with factors like speed limits,
parking, stopping at signals, etc.
• A logical and mathematical model is then
created on the basis of that data, and the car
works according to this model. Also, the more
data is fed in, the more accurate the output
becomes.
• According to Tom Mitchell, “A computer program is
said to learn from experience E with respect
to some class of tasks T and performance measure P,
if its performance at tasks in T, as measured by P,
improves with experience E.”
Example: In spam e-mail detection,
• Task, T: To classify mails as Spam or Not Spam.
• Performance measure, P: Percentage of mails
correctly classified as “Spam” or “Not
Spam”.
• Experience, E: A set of mails labeled “Spam” or “Not Spam”.
Steps for Designing Learning System
are:
Step 1) Choosing the Training
Experience:
• The first and most important task is to choose the
training data or training experience which will be fed
to the machine learning algorithm. It is important
to note that the data or experience that we feed to the
algorithm has a significant impact on the
success or failure of the model. So the training data or
experience should be chosen wisely.
Below are the attributes of the training
experience that affect the success or failure
of the model
• The first attribute is whether the training experience provides direct
or indirect feedback regarding the choices made. For example:
while playing chess, the training data provides
feedback such as "if this move is chosen instead of that one,
the chances of success increase."
• The second important attribute is the degree to which the
learner controls the sequence of training
examples. For example: when training data is first fed to
the machine, its accuracy is very low,
but as it gains experience by playing again and
again against itself or an opponent, the algorithm
gets feedback and controls the chess game
accordingly.
• The third important attribute is how well the training experience represents the
distribution of examples over which performance
will be measured. For example, a machine learning
algorithm gains experience by going through
many different cases and examples.
The more examples it passes through, the more
experience it gains and the better its performance
becomes.
Step 2- Choosing target function:
• The next important step is choosing the target
function. Based on the knowledge fed to
the algorithm, the learner chooses a
NextMove function that describes which
legal move should be taken.
• For example: while playing chess against an
opponent, after the opponent moves, the machine
learning algorithm decides which of the
possible legal moves to take in order to succeed.
Step 3- Choosing Representation
for Target function:
• Once the algorithm knows all the
possible legal moves, the next step is to choose the
optimized move using some representation, e.g.
linear equations, a hierarchical graph representation,
a tabular form, etc. The NextMove function then selects,
out of these moves, the target move with
the highest success rate.
• For example: if the machine has 4 possible moves
while playing chess, it will choose the
optimized move that is most likely to lead to success.
Step 4- Choosing Function
Approximation Algorithm:
• An optimized move cannot be chosen from the
training data alone. The learner has to go through
a set of examples, and from these examples it
approximates which steps should be
chosen; the machine then receives feedback
on them.
• For example: when chess training data
is fed to the algorithm, it is not known in advance whether
the algorithm will fail or succeed. From each
failure or success it estimates which step should be
chosen for the next move and what its success
rate is.
Step 5- Final Design:
• The final design is created at last, after the system has gone
through a number of examples, failures and successes,
correct and incorrect decisions, and choices of
next step. Example: Deep Blue is a
chess-playing computer that won a match
against world chess champion Garry Kasparov in 1997, becoming
the first computer to defeat a reigning world
chess champion in a match.
Concept of Hypothesis
• The hypothesis is a common term in Machine Learning and data science
projects. As we know, machine learning is one of the most powerful
technologies across the world, which helps us to predict results based on
past experiences.
• Moreover, data scientists and ML professionals conduct experiments that
aim to solve a problem. These ML professionals and data scientists make
an initial assumption for the solution of the problem.
• This assumption in Machine learning is known as Hypothesis.
• In Machine Learning, at various times, Hypothesis and Model are used
interchangeably. However, a Hypothesis is an assumption made by
scientists, whereas a model is a mathematical representation that is used to
test the hypothesis
What is Hypothesis?
• A hypothesis is a supposition or proposed
explanation based on limited evidence or
assumptions. It is a guess based on some known facts that
has not yet been proven. A good hypothesis is testable, so it
can be shown to be either true or false.
• Example: Let's understand the hypothesis with a common
example. Suppose a scientist claims that ultraviolet (UV) light can
damage the eyes, and we go on to assume that it may also cause blindness.
• In this example, the scientist only claims that UV rays are harmful
to the eyes, but we assume they may cause blindness. This
may or may not be true. Hence, such an
assumption is called a hypothesis.
Two important types of hypotheses
are as follows:
• Null Hypothesis: A null hypothesis is a type of
statistical hypothesis which states that no
statistically significant effect exists in the given set of
observations. It is also known as a conjecture and is
used in quantitative analysis to test theories about
markets, investment, and finance to decide whether
an idea is true or false.
Alternative Hypothesis:
• An alternative hypothesis is a direct contradiction of
the null hypothesis, which means if one of the two
hypotheses is true, then the other must be false. In
other words, an alternative hypothesis is a type of
statistical hypothesis which tells that there is some
significant effect that exists in the given set of
observations
Hypothesis in Machine Learning
(ML)
• The hypothesis is one of the commonly used
concepts of statistics in Machine Learning. It is
specifically used in Supervised Machine learning,
where an ML model learns a function that best maps
the input to corresponding outputs with the help of
an available dataset.
The following figure shows the common method to
find out the possible hypothesis from the Hypothesis
space:
• Supervised learning searches the hypothesis space for
the hypothesis that best fits the data. The
hypothesis space is written with an uppercase H
and a single hypothesis with a lowercase h. These are
defined as follows:
Hypothesis space (H):
• Hypothesis space is defined as a set of all
possible legal hypotheses; hence it is also known
as a hypothesis set. It is used by supervised
machine learning algorithms to determine the best
possible hypothesis to describe the target function or
best maps input to output.
• It is often constrained by choice of the framing of
the problem, the choice of model, and the choice of
model configuration
Hypothesis (h):
• It is defined as the approximate function that best describes the
target in supervised machine learning algorithms. It is
primarily based on data as well as bias and
restrictions applied to data.
• Hence hypothesis (h) can be concluded as a single
hypothesis that maps input to proper output and can
be evaluated as well as used to make predictions.
• For a simple linear model, the hypothesis (h) can be formulated
as follows:
y = mx + b
Where,
• y: range (the predicted output)
• m: slope of the line, i.e. the change in y
divided by the change in x
• x: domain (the input)
• b: intercept (constant)
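As a small illustration of the formula above, the sketch below treats each (m, b) pair as one hypothesis h and a handful of such pairs as a tiny hypothesis space H. The data points, the candidate values, and the use of mean squared error as the selection criterion are assumptions made for this example.

```python
def make_hypothesis(m, b):
    """Return the hypothesis h(x) = m * x + b for a fixed slope and intercept."""
    return lambda x: m * x + b

def mean_squared_error(h, xs, ys):
    """Average squared difference between h(x) and the observed y."""
    return sum((h(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical training data generated by y = 2x + 1.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

# A tiny hypothesis space H: every combination of a few slopes and intercepts.
H = {(m, b): make_hypothesis(m, b) for m in (1, 2, 3) for b in (0, 1)}

# Pick the hypothesis h in H with the lowest error on the training data.
best = min(H, key=lambda mb: mean_squared_error(H[mb], xs, ys))
print(best)  # (2, 1)
```

A real learner searches a continuous space of slopes and intercepts rather than six candidates, but the idea of choosing one h out of H by how well it fits the data is the same.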
Example: Let's understand the hypothesis (h) and hypothesis
space (H) with a two-dimensional coordinate plane showing the
distribution of data as follows:
Now, assume we have some test data for which
ML algorithms predict the outputs, as
follows:
If we divide this coordinate plane in such a way that it
can help us predict the output, the result is as follows:
Based on the given test data, the output
will be as follows:
However, based on the data, algorithm, and constraints,
this coordinate plane can also be divided in
other ways, as follows:
With the above example, we can conclude that:
• Hypothesis space (H) is the set of all legal
ways to divide the coordinate plane so
that inputs are mapped to outputs.
• Further, each individual way of dividing the plane is called a
hypothesis (h). Hence, the hypothesis and hypothesis
space would be like this:
Version Spaces
• A version space is a hierarchical representation of
knowledge that enables you to keep track of all the
useful information supplied by a sequence of
learning examples without remembering any of the
examples.
• The version space method is a concept learning
process accomplished by managing multiple
models within a version space.
• A hypothesis “h” is consistent with a set of
training examples D of target concept c if and
only if h(x) = c(x) for each training example in
D.
• The version space VS with respect to
hypothesis space H and training examples D
is the subset of hypotheses from H consistent
with all training examples in D.
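The definition above can be checked by brute force on a small problem: enumerate every hypothesis in H and keep the ones consistent with D. The sketch below assumes conjunctive hypotheses in which each attribute is either a fixed value or the wildcard "?"; the attribute domains and the three training examples are hypothetical, and the empty hypothesis that matches nothing is left out for brevity.

```python
from itertools import product

def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(a == "?" or a == v for a, v in zip(h, x))

def version_space(domains, examples):
    """All hypotheses h with h(x) = c(x) for every training example (x, c(x))."""
    hypotheses = product(*[values + ("?",) for values in domains])
    return [h for h in hypotheses
            if all(covers(h, x) == label for x, label in examples)]

# Hypothetical attribute domains and training examples D.
domains = [("Big", "Small"), ("Red", "Blue"), ("Circle", "Triangle")]
examples = [
    (("Small", "Red", "Circle"), True),
    (("Big", "Red", "Circle"), False),
    (("Small", "Red", "Triangle"), False),
]

vs = version_space(domains, examples)
print(vs)  # [('Small', 'Red', 'Circle'), ('Small', '?', 'Circle')]
```

Enumeration only works for tiny hypothesis spaces; the boundary-set representation used by Candidate-Elimination avoids listing every member.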
Version Space Characteristics
• A version space represents all the alternative
plausible descriptions of a heuristic.
• A plausible description is one that is applicable to
all known positive examples and no known
negative example.
A version space description consists of two
complementary trees:
1.One that contains nodes connected to
overly general models, and
2.One that contains nodes connected to
overly specific models.
Diagrammatical Guidelines
• There is a generalization tree and
a specialization tree.
• Each node is connected to a model.
• Nodes in the generalization tree are connected to a
model that matches everything in its subtree.
• Nodes in the specialization tree are connected to a
model that matches only one thing in its subtree.
Links between nodes and their models denote
• generalization relations in a generalization tree, and
• specialization relations in a specialization tree.
Diagram of a Version Space
In the diagram, the specialization tree is colored red, and the
generalization tree is colored green.
Generalization and Specialization
Leads to Version Space
Convergence
• The key idea in version space learning is that
specialization of the general models and
generalization of the specific models may
ultimately lead to just one correct model that
matches all observed positive examples and does
not match any negative examples.
Version Space Method Learning
Algorithm: Candidate-
Elimination
• The Candidate Elimination Algorithm computes the
version space containing all hypotheses from H that are
consistent with an observed sequence of training
examples.
• It begins by initializing the version space to the set of all
hypotheses in H, that is, by initializing the G boundary
set to contain the most general hypotheses in H
• G0 ← {<?,?,?,?,?,?,?>}
• And initializing the S boundary set to contain the most
specific hypothesis.
• S0 ← {<0,0,0,0,0,0,0>}
• These two boundary sets delimit the entire hypothesis space
because every other hypothesis in H is both more general than
S0 and more specific than G0.
• As each training example is considered, the S and G boundary sets
are generalized and specialized, respectively to eliminate from the
version space any hypothesis found inconsistent with the new
training example.
• After all the examples have been processed, the computed version
space contains all the hypotheses consistent with these examples,
and only those hypotheses.
The Candidate Elimination Algorithm goes as follows:
1. Initialize G to the set of maximally general hypotheses in H.
2. Initialize S to the set of maximally specific hypotheses in H.
3. For each training example d:
   If d is a positive example:
   1. Remove from G any hypothesis that does not include d.
   2. For each hypothesis s in S that does not include d, remove s from S and
      add to S all minimal generalizations h of s such that h includes d and
      some member of G is more general than h.
   3. Remove from S any hypothesis that is more general than another
      hypothesis in S.
   If d is a negative example:
   1. Remove from S any hypothesis that includes d.
   2. For each hypothesis g in G that includes d, remove g from G and
      add to G all minimal specializations h of g such that h does not
      include d and some member of S is more specific than h.
   3. Remove from G any hypothesis that is less general than another
      hypothesis in G.
4. If G or S ever becomes empty, the data are not consistent (with H).
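The steps above can be sketched in Python for conjunctive hypotheses over discrete attributes. This is an illustrative sketch rather than a reference implementation: hypotheses are tuples in which "?" matches any value and "0" matches nothing, and the helper names, the attribute domains, and the five training examples are assumptions made for this example.

```python
def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(a == "?" or a == v for a, v in zip(h, x))

def more_general_or_equal(h1, h2):
    """True if h1 covers every instance that h2 covers ('0' matches nothing)."""
    return all(a == "?" or a == b or b == "0" for a, b in zip(h1, h2))

def generalize(s, x):
    """The minimal generalization of s that covers instance x."""
    return tuple(v if a in ("0", v) else "?" for a, v in zip(s, x))

def specialize(g, x, domains):
    """All minimal specializations of g that exclude instance x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, values in enumerate(domains) if g[i] == "?"
            for v in values if v != x[i]]

def candidate_elimination(examples, domains):
    S = {("0",) * len(domains)}  # most specific boundary
    G = {("?",) * len(domains)}  # most general boundary
    for x, positive in examples:
        if positive:
            G = {g for g in G if covers(g, x)}
            new_S = set()
            for s in S:
                h = s if covers(s, x) else generalize(s, x)
                if any(more_general_or_equal(g, h) for g in G):
                    new_S.add(h)
            # Keep only the most specific members of S.
            S = {s for s in new_S
                 if not any(s != t and more_general_or_equal(s, t) for t in new_S)}
        else:
            S = {s for s in S if not covers(s, x)}
            new_G = set()
            for g in G:
                candidates = specialize(g, x, domains) if covers(g, x) else [g]
                new_G.update(h for h in candidates
                             if any(more_general_or_equal(h, s) for s in S))
            # Keep only the most general members of G.
            G = {g for g in new_G
                 if not any(g != t and more_general_or_equal(t, g) for t in new_G)}
        if not S or not G:
            raise ValueError("training data are not consistent with H")
    return S, G

# Hypothetical domains and labeled examples (True = positive).
domains = [("Big", "Small"), ("Red", "Blue"), ("Circle", "Triangle")]
examples = [
    (("Big", "Red", "Circle"), False),
    (("Small", "Red", "Triangle"), False),
    (("Small", "Red", "Circle"), True),
    (("Big", "Blue", "Circle"), False),
    (("Small", "Blue", "Circle"), True),
]
S, G = candidate_elimination(examples, domains)
print(S, G)  # {('Small', '?', 'Circle')} {('Small', '?', 'Circle')}
```

On this data the two boundary sets meet at a single hypothesis, which is the convergence described earlier; with fewer examples S and G would stay apart and bound a larger version space.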
Advantages of the version space method:
• Can describe all the possible hypotheses in the language
consistent with the data.
• Fast (close to linear).
Disadvantages of the version space method:
• Inconsistent data (noise) may cause the target concept to
be pruned.
• Learning disjunctive concepts is challenging.
Example 2
Size    Colour  Shape     Class/label
Big     Red     Circle    No
Small   Red     Triangle  No
Small   Red     Circle    Yes
Big     Blue    Circle    No
Small   Blue    Circle    Yes
Find-S Algorithm
Find the maximally specific
hypothesis
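A minimal Find-S sketch applied to the five rows of Example 2 is shown below. It assumes conjunctive hypotheses in which "?" matches any value; the function names `find_s` and `matches` are chosen for this illustration.

```python
def find_s(examples):
    """Return the maximally specific hypothesis covering all positive examples."""
    hypothesis = None  # None stands for the most specific hypothesis (matches nothing)
    for attributes, label in examples:
        if label != "Yes":
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # first positive example is copied as-is
        else:
            # Replace every attribute that disagrees with the example by "?".
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis

def matches(hypothesis, attributes):
    """True if the hypothesis classifies the instance as positive."""
    return all(h == "?" or h == a for h, a in zip(hypothesis, attributes))

# The rows of Example 2: (Size, Colour, Shape), Class/label.
examples = [
    (("Big", "Red", "Circle"), "No"),
    (("Small", "Red", "Triangle"), "No"),
    (("Small", "Red", "Circle"), "Yes"),
    (("Big", "Blue", "Circle"), "No"),
    (("Small", "Blue", "Circle"), "Yes"),
]
h = find_s(examples)
print(h)  # ['Small', '?', 'Circle']
```

The result says that an object is labeled Yes when it is a small circle of any colour, and none of the three negative rows matches it.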
Performance Metrics
Evaluating the performance of a Machine learning model
is one of the important steps while building an effective ML
model. To evaluate the performance or quality of the
model, different metrics are used, and these metrics
are known as performance metrics or evaluation
metrics.
These performance metrics help us understand how well
our model has performed for the given data. In this way,
we can improve the model's performance by tuning the
hyper-parameters. Each ML model aims to generalize well
on unseen/new data, and performance metrics help
determine how well the model generalizes on the new
dataset.
• In machine learning, most supervised tasks fall into
one of two groups: classification and regression.
Not all metrics can be used for all types of
problems; hence, it is important to know and
understand which metrics should be used.
• Different evaluation metrics are used for
regression and for classification tasks. In this
topic, we will discuss metrics used for
classification and regression tasks.
Performance Metrics for
Classification
In a classification problem, the category
or classes of data is identified based on training
data. The model learns from the given dataset
and then classifies the new data into classes or
groups based on the training. It predicts class
labels as the output, such as Yes or No, 0 or 1,
Spam or Not Spam, etc. To evaluate the
performance of a classification model, different
metrics are used, and some of them are as
follows:
• Accuracy
• Confusion Matrix
• Precision
• Recall
• F-Score
• AUC-ROC (Area Under the ROC Curve)
I. Accuracy
The accuracy metric is one of the simplest
classification metrics to implement. It is the
ratio of the number of correct predictions
to the total number of predictions.
It can be formulated as:
Accuracy = Number of correct predictions / Total number of predictions
II. Confusion Matrix
• A confusion matrix is a tabular representation of
prediction outcomes of any binary classifier, which is
used to describe the performance of the classification
model on a set of test data when true values are known.
• The confusion matrix is simple to implement, but the
terminologies used in this matrix might be confusing for
beginners.
• A typical confusion matrix for a binary classifier looks
like the image below (however, it can be extended to
classifiers with more than two classes).
We can determine the following
from the above matrix:
• In the matrix, columns are for the predicted values, and
rows specify the actual values. Here both actual and
predicted values take one of two possible classes, Yes or No. So, if
we are predicting the presence of a disease in a patient,
a prediction of Yes means the patient has the
disease, and No means the patient doesn't have the
disease.
• In this example, the total number of predictions is 165,
out of which the classifier predicted Yes 110 times and
No 55 times.
• However, in reality, there are 60 cases in which patients don't
have the disease and 105 cases in which patients
have the disease.
In general, the table is divided
into four terms, which
are as follows:
1. True Positive (TP): the model predicted the positive
class, and the actual class is positive.
2. True Negative (TN): the model predicted the negative
class, and the actual class is negative.
3. False Positive (FP): the model predicted the positive
class, but the actual class is negative.
4. False Negative (FN): the model predicted the negative
class, but the actual class is positive.
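The four terms can be counted directly from paired lists of actual and predicted labels. The sketch below is a plain-Python illustration; the function name `confusion_counts` and the six labels are hypothetical.

```python
def confusion_counts(actual, predicted, positive="Yes"):
    """Count TP, TN, FP and FN for a binary classifier."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p != positive and a != positive:
            tn += 1  # predicted negative, actually negative
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        else:
            fn += 1  # predicted negative, actually positive
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}

# Hypothetical labels for six patients.
actual    = ["Yes", "Yes", "No", "No",  "Yes", "No"]
predicted = ["Yes", "No",  "No", "Yes", "Yes", "No"]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```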
III. Precision
The precision metric is used to overcome
the limitation of Accuracy. Precision
is the proportion of positive predictions
that were actually correct. It is calculated as
the number of True Positives divided by the
total number of positive predictions (True
Positives plus False Positives):
Precision = TP / (TP + FP)
IV. Recall or Sensitivity
Recall is similar to the Precision metric; however, it
measures the proportion of actual positives that were
identified correctly. It is calculated as the number of True Positives
divided by the total number of actual
positives, whether correctly predicted as positive or incorrectly
predicted as negative (True Positives plus False Negatives).
The formula for calculating Recall is given below:
Recall = TP / (TP + FN)
• Specificity
Specificity, in contrast to recall, is the proportion of actual
negatives that our ML model correctly identifies as negative. We can easily calculate it from the
confusion matrix with the help of the following formula:
Specificity = TN / (TN + FP)
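The metrics defined so far can be computed from the four confusion-matrix cells as follows. The cell counts are an assumption: the matrix image is not reproduced in this text, so the values below were chosen only to agree with the totals quoted in the example (165 predictions, 110 predicted Yes, 55 predicted No, 105 actual Yes, 60 actual No).

```python
# Assumed cell counts, consistent with the totals quoted in the example.
tp, fp, fn, tn = 100, 10, 5, 50

accuracy = (tp + tn) / (tp + tn + fp + fn)  # correct predictions / all predictions
precision = tp / (tp + fp)                  # correct positives / predicted positives
recall = tp / (tp + fn)                     # correct positives / actual positives
specificity = tn / (tn + fp)                # correct negatives / actual negatives

print(round(accuracy, 3), round(precision, 3),
      round(recall, 3), round(specificity, 3))
# 0.909 0.909 0.952 0.833
```

Other cell counts also fit the quoted totals, so the numbers printed here illustrate the formulas rather than the original slide's exact matrix.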
V. F-Scores
• The F-score or F1 Score is a metric to evaluate a
binary classification model on the basis of
predictions that are made for the positive class. It
is calculated with the help of Precision and Recall.
It is a single score that represents both
Precision and Recall. The F1 Score is
calculated as the harmonic mean of
Precision and Recall, assigning equal weight to
each of them.
The formula for calculating the F1 score is given
below:
F1 = 2 × (Precision × Recall) / (Precision + Recall)
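As a quick illustration of the harmonic mean, the sketch below computes the F1 score from a precision and a recall value. The function name `f1_score` and the sample values are assumptions made for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# A model with perfect recall but only 50% precision.
score = f1_score(0.5, 1.0)
print(round(score, 3))  # 0.667
```

The harmonic mean sits closer to the smaller of the two values than the arithmetic mean (0.75 here) would, so a model cannot earn a high F1 score by doing well on only one of the two metrics.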
VI. AUC-ROC
Sometimes we need to visualize the
performance of the classification model on
a chart; then, we can use the AUC-ROC curve. It
is one of the popular and important metrics for
evaluating the performance of a classification
model.
• Firstly, let's understand the ROC (Receiver
Operating Characteristic) curve. The ROC curve
is a graph that shows the
performance of a classification model at
different threshold levels. The curve is
plotted between two parameters, which are:
• True Positive Rate
• False Positive Rate
TPR or True Positive Rate is a synonym for
Recall, hence can be calculated as:
TPR = TP / (TP + FN)
FPR or False Positive Rate can be
calculated as:
FPR = FP / (FP + TN)
To calculate the points of a ROC curve,
we could evaluate a model (for example, logistic regression)
many times with different classification
thresholds, but this would not be very efficient.
Instead, an efficient aggregate measure is used, which
is known as AUC.
AUC: Area Under the ROC
curve
• AUC stands for Area Under the ROC
Curve. As its name suggests, AUC measures
the two-dimensional area under the entire
ROC curve, as shown in the image below:
• AUC measures performance across all
thresholds and provides an aggregate
measure. The value of AUC ranges from 0 to
1: a model whose predictions are 100% wrong
has an AUC of 0.0, whereas
a model whose predictions are 100% correct has
an AUC of 1.0.
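The ROC points and the area under them can be sketched in plain Python. This is a minimal illustration that sweeps the threshold over the observed scores and applies the trapezoidal rule; the function names, labels, and scores are hypothetical, and real projects would normally rely on a tested library routine instead.

```python
def roc_points(labels, scores):
    """(FPR, TPR) pairs obtained by lowering the threshold over the scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical true labels (1 = positive) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

area = auc(roc_points(labels, scores))
print(round(area, 3))  # 0.889
```

Here 8 of the 9 positive-negative pairs are ranked correctly, which matches the area of 8/9; a scorer that ranks every positive above every negative gives 1.0, and one that ranks them all the wrong way round gives 0.0.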

Unit 1-ML (1) (1).pptx

  • 7. COMMONLY USED ALGORITHMS IN MACHINE LEARNING There are many different models in Machine Learning. Here are the most commonly used algorithms today: • Gradient Boosting algorithms • Dimensionality Reduction algorithms • Random Forest • K-Means • KNN • Naive Bayes • SVM • Decision Tree • Logistic Regression • Linear Regression
  • 8. • Machine Learning or ML is slowly but steadily having a huge impact on data-driven business decisions across the globe. It has also given organizations the right information to make more informed, data-driven choices more quickly than conventional methodologies allow. • Yet, despite its high productivity, there are many issues in Machine Learning that cannot be overlooked.
  • 9. Machine Learning : A General Perspective • The goal in machine learning is to recognize the general patterns in a dataset. Once you have recognized the patterns, you can use this information to model the data, to interpret the data, or to predict the outcome for new data that has not been seen before.
  • 10. • Machine learning is a subfield of artificial intelligence, and machine learning algorithms are used in related fields such as natural language processing and computer vision. • In general, there are three types of learning: • supervised learning, • unsupervised learning, and • reinforcement learning. Their names convey the main idea behind each.
  • 11. Supervised learning In supervised learning, your system learns under the supervision of the data outputs, so supervised algorithms are preferred if your dataset contains output information. Here is an example. • Let’s assume you run a medical statistics company and you have a dataset which contains patients’ features such as blood pressure, blood sugar level, heart rate per minute, etc. • You also have information about whether or not they have experienced heart disease in their life.
  • 12. • By training a machine learning algorithm, your system can find a pattern between the features and the probability of experiencing heart disease. Your algorithm can then predict whether a new patient is at risk of heart disease, so that a doctor can take precautions and save a person’s life.
  • 13. A Decision Tree from one of the projects. The x’s are features, which are medical tests in this case, and the 0 and 1 values in the boxes indicate whether heart disease is present. As you can see, the algorithm produced an interpretable tree.
  • 14. Unsupervised Learning • Unsupervised algorithms are used if your data doesn’t contain outputs and you would like to discover the clusters in the dataset. • A good example of unsupervised learning is handwritten digit recognition. • In this application you know that there should be 10 clusters {0,1,2,3,4,5,6,7,8,9}, but the problem with handwritten digits is that there are countless ways to write a digit by hand, and everyone writes digits differently.
  • 15. • How does a computer understand what is written by hand? • Here, you can use an unsupervised algorithm like K-means or the EM algorithm. • With these algorithms, you start with random initial cluster means, and iteratively these mean points converge towards the real cluster mean values. After you complete the training, if you visualize the means of the clusters you can see that they really look like digits. You then label these clusters with the corresponding digits, and when the computer encounters a new handwritten digit, the algorithm labels it with the mean which is closest to it.
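The K-means loop described above can be sketched as follows. This is a simplified illustration on small 2-D points rather than digit images; the function names, the `initial_means` and `seed` parameters, and the sample points are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def closest_mean(point, means):
    """Index of the cluster mean nearest to the point."""
    return min(range(len(means)), key=lambda i: squared_distance(point, means[i]))


def k_means(points, k, initial_means=None, iterations=20, seed=0):
    # Start from k randomly chosen data points unless starting means are given.
    if initial_means is None:
        initial_means = random.Random(seed).sample(points, k)
    means = [tuple(m) for m in initial_means]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest mean.
        clusters = [[] for _ in means]
        for p in points:
            clusters[closest_mean(p, means)].append(p)
        # Update step: move each mean to the average of its members.
        new_means = []
        for old, members in zip(means, clusters):
            if members:
                new_means.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_means.append(old)  # keep the old mean if a cluster is empty
        if new_means == means:  # converged
            break
        means = new_means
    return means


# Hypothetical data: two well-separated groups of 2-D points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
means = k_means(points, k=2)
# A new point is labelled with the index of the closest mean.
print(means, closest_mean((7.5, 8.5), means))
```

For handwritten digits, each point would be a vector of pixel values and k would be 10; the cluster indices would still have to be mapped to digit labels by hand, as the slide describes.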
  • 17. Reinforcement learning Let’s assume you want to create an intelligent agent which plays chess. In chess, you can’t evaluate moves one at a time in isolation. Your agent should consider a series of moves and then take the action which maximizes the utility. Therefore your agent should play a couple of turns against itself and decide the best action to take. We call this type of learning reinforcement learning, and it is generally used in games.
  • 19. • Poor Quality of Data • Under fitting of Training Data • Over fitting of Training Data • Machine Learning is a Complex Process • Lack of Training Data • Slow Implementation • Imperfections in the Algorithm When Data Grows
  • 20. 1. Poor Quality of Data • Data plays a significant role in the machine learning process. One of the significant issues that machine learning professionals face is the absence of good quality data. Unclean and noisy data can make the whole process extremely exhausting. We don’t want our algorithm to make inaccurate or faulty predictions. • Hence the quality of data is essential to enhance the output. Therefore, we need to ensure that the process of data preprocessing which includes removing outliers, filtering missing values, and removing unwanted features, is done with the utmost level of perfection.
  • 21. 2. Underfitting of Training Data Underfitting occurs when the model is unable to establish an accurate relationship between input and output variables. It is like trying to fit into undersized jeans. It signifies that the model is too simple to capture the relationship in the data. To overcome this issue: • Increase the training time of the model • Enhance the complexity of the model • Add more features to the data • Reduce regularization
  • 22. 3. Overfitting of Training Data • Overfitting refers to a machine learning model that fits its training data too closely, including the noise and bias in it, which negatively affects its performance on new data. It is like trying to fit into oversized jeans. Unfortunately, this is one of the significant issues faced by machine learning professionals. It often arises when the algorithm is trained with noisy and biased data, which will affect its overall performance.
  • 23. • Let’s understand this with the help of an example. Let’s consider a model trained to differentiate between a cat, a rabbit, a dog, and a tiger. The training data contains 1000 cats, 1000 dogs, 1000 tigers, and 4000 rabbits. Then there is a considerable probability that it will identify a cat as a rabbit. In this example, we had a vast amount of data, but it was biased; hence the prediction was negatively affected.
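To make the bias in this example concrete, the sketch below counts the class shares in the training set described above and shows what a model that always answers "rabbit" would score on data with the same mix. The inverse-frequency class weights at the end are one common heuristic for compensating, included here as an assumption rather than something the slides prescribe.

```python
from collections import Counter

# Class counts taken from the example above.
labels = ["cat"] * 1000 + ["dog"] * 1000 + ["tiger"] * 1000 + ["rabbit"] * 4000

counts = Counter(labels)
total = sum(counts.values())

# Share of the training set held by each class.
shares = {c: n / total for c, n in counts.items()}

# A model that always predicts the majority class already reaches this accuracy
# on data with the same class mix, without learning anything about the animals.
majority_class, majority_count = counts.most_common(1)[0]
baseline_accuracy = majority_count / total

# Assumed heuristic: weight each class inversely to its frequency.
class_weights = {c: total / (len(counts) * n) for c, n in counts.items()}

print(majority_class, round(baseline_accuracy, 3), class_weights)
```

Rabbits make up more than half of the examples, so a model can look reasonably accurate while leaning heavily towards the rabbit label.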
  • 24. 4. Machine Learning is a Complex Process • The machine learning industry is young and is continuously changing. Rapid trial-and-error experiments are being carried out. The process is still evolving, and hence there are high chances of error, which makes the learning complex. • It includes analyzing the data, removing data bias, training the model, applying complex mathematical calculations, and a lot more. Hence it is a really complicated process, which is another big challenge for machine learning professionals.
  • 25. 5. Lack of Training Data • The most important task in the machine learning process is to train the model on enough data to achieve an accurate output. Too little training data will produce inaccurate or overly biased predictions. • Let us understand this with the help of an example.
  • 26. • Consider how a machine learning algorithm compares with teaching a child. One day you decide to explain to a child how to distinguish between an apple and a watermelon. You take an apple and a watermelon and show him the difference between them based on their color, shape, and taste. • In this way, he will soon become very good at differentiating between the two. • A machine learning algorithm, on the other hand, needs a lot of data to make the same distinction. For complex problems, it may even require millions of examples. Therefore we need to ensure that machine learning algorithms are trained with sufficient amounts of data.
  • 27. 6. Slow Implementation • This is one of the common issues faced by machine learning professionals. Machine learning models are highly effective at providing accurate results, but producing them can take a tremendous amount of time. • Slow programs, data overload, and excessive requirements usually mean that it takes a lot of time to get accurate results. Further, a model requires constant monitoring and maintenance to deliver the best output.
  • 28. 7. Imperfections in the Algorithm When Data Grows • The model may become useless in the future as data grows. The best model of the present may become inaccurate in the future and require further adjustment. So you need regular monitoring and maintenance to keep the algorithm working. This is one of the most exhausting issues faced by machine learning professionals.
  • 29. Conclusion: • It is one of the most rapidly growing technologies used in medical diagnosis, speech recognition, robotic training, product recommendations, video surveillance, and this list goes on
  • 30. Design a Learning System in Machine Learning According to Arthur Samuel “Machine Learning enables a Machine to Automatically learn from Data, Improve performance from an Experience and predict things without explicitly programmed.”
  • 31. • In simple words, when we feed training data to a machine learning algorithm, the algorithm produces a mathematical model, and with the help of this model the machine makes predictions and takes decisions without being explicitly programmed. • Also, the more training data the machine works with, the more experience it gains and the more efficient its results become.
  • 33. Example : • In a driverless car, training data is fed to the algorithm about how to drive on highways and in busy and narrow streets, with factors like speed limits, parking, stopping at signals, etc. • A logical and mathematical model is then created on that basis, and the car works according to this model. Also, the more data is fed, the more efficient the output.
  • 34. • According to Tom Mitchell, “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
  • 35. Example: In Spam E-Mail detection, • Task, T: To classify mails as Spam or Not Spam. • Performance measure, P: Percentage of mails correctly classified as “Spam” or “Not Spam”. • Experience, E: A set of mails labelled “Spam” or “Not Spam”.
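As a rough illustration of the performance measure P, the sketch below scores a classifier against a small labelled set E. The keyword rule in `classify`, the word list, and the sample mails are all hypothetical; the rule is fixed rather than learned, so the snippet only shows how P is measured, not how it improves with experience.

```python
def classify(mail, spam_words=("free", "winner", "prize")):
    """Task T: label a mail as "Spam" or "Not Spam" (toy keyword rule)."""
    text = mail.lower()
    return "Spam" if any(word in text for word in spam_words) else "Not Spam"


def performance(mails):
    """Performance measure P: percentage of mails classified correctly."""
    correct = sum(1 for text, label in mails if classify(text) == label)
    return 100.0 * correct / len(mails)


# Experience E: hypothetical mails with known labels.
experience = [
    ("You are a WINNER, claim your prize", "Spam"),
    ("Meeting moved to 3 pm", "Not Spam"),
    ("Free tickets for you", "Spam"),
    ("Lunch tomorrow?", "Not Spam"),
    ("Cheap loans available now", "Spam"),
]
p = performance(experience)
print(p)
```

A learning system in Mitchell's sense would adjust its rule from E so that this percentage rises as more labelled mails are seen.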
  • 36. Steps for Designing Learning System are:
  • 37. Step 1) Choosing the Training Experience: • The first and most important task is to choose the training data or training experience which will be fed to the Machine Learning Algorithm. It is important to note that the data or experience that we feed to the algorithm has a significant impact on the success or failure of the model. So the training data or experience should be chosen wisely.
  • 38. Below are the attributes which affect the success or failure of the model: • The first attribute is whether the training experience provides direct or indirect feedback regarding choices. For example: while playing chess, the training data provides feedback such as: if that move is chosen instead of this one, the chances of success increase.
  • 39. • The second important attribute is the degree to which the learner controls the sequence of training examples. For example: when training data is first fed to the machine, its accuracy is very low, but as it gains experience by playing again and again against itself or an opponent, the algorithm gets feedback and controls the chess game accordingly.
  • 40. • The third important attribute is how well the training experience represents the distribution of examples over which performance will be measured. For example, a machine learning algorithm gains experience by going through a number of different cases and examples. The more examples it passes through, the more experience it gains, and hence its performance increases.
  • 41. Step 2- Choosing target function: • The next important step is choosing the target function. This means that, according to the knowledge fed to the algorithm, the machine learning system will choose a NextMove function which describes which legal move should be taken.
  • 42. • For example: while playing chess, after the opponent has played, the machine learning algorithm determines the possible legal moves it can take in order to succeed.
  • 43. Step 3- Choosing Representation for Target function: • Once the algorithm knows all the possible legal moves, the next step is to choose the optimized move using some representation, e.g. linear equations, a hierarchical graph representation, tabular form, etc. The NextMove function then selects the target move, i.e. the one out of these moves which provides the highest success rate.
  • 44. • For example: if the machine has 4 possible moves while playing chess, it will choose the optimized move which gives it the best chance of success.
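One way to picture the linear-equation representation is a weighted sum of board features, with the next move chosen as the candidate that scores highest. In the sketch below the two features (material advantage and mobility), the weights, and the four candidate moves are invented for illustration and are not taken from the slides.

```python
def evaluate(features, weights):
    """Linear representation: V = w0 + w1*x1 + ... + wn*xn."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def next_move(candidates, weights):
    """Pick the move whose resulting board has the highest estimated value."""
    return max(candidates, key=lambda move: evaluate(candidates[move], weights))


# Hypothetical candidates: move -> (material advantage, number of legal replies).
candidates = {
    "a": (0, 20),
    "b": (1, 15),
    "c": (-1, 30),
    "d": (3, 5),
}
weights = [0.0, 1.0, 0.1]  # assumed weights: w0, w_material, w_mobility

best = next_move(candidates, weights)
print(best, evaluate(candidates[best], weights))
```

A table or a graph could stand in for the linear formula; the choice of representation decides what the learning algorithm has to estimate, here just three weights.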
  • 45. Step 4- Choosing Function Approximation Algorithm: • An optimized move cannot be chosen from the training data alone. The algorithm has to go through a set of examples, and from these examples it approximates which steps should be chosen; the machine then receives feedback on them.
  • 46. • For example: when training data from playing chess is fed to the algorithm, it is not known in advance whether the algorithm will fail or succeed. From each failure or success it estimates which step should be chosen for the next move and what its success rate is.
  • 47. Step 5- Final Design: • The final design is created at the end, after the system has gone through a number of examples, failures and successes, correct and incorrect decisions, choices of next step, etc. Example: Deep Blue is an intelligent, ML-based computer which won a chess match against the chess expert Garry Kasparov, and it became the first computer to beat a reigning world chess champion.
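The slides do not name a specific approximation algorithm. One widely taught choice for a linear evaluation function is the least mean squares (LMS) rule, which nudges each weight in proportion to the prediction error. The sketch below assumes that rule; the feature values, training target, and learning rate are illustrative only.

```python
def predict(weights, features):
    """Linear estimate w0 + w1*x1 + ... + wn*xn."""
    x = (1.0,) + tuple(features)  # constant 1.0 pairs with the bias weight w0
    return sum(w * xi for w, xi in zip(weights, x))


def lms_update(weights, features, target, lr=0.1):
    """One LMS step: w_i <- w_i + lr * (target - prediction) * x_i."""
    x = (1.0,) + tuple(features)
    error = target - predict(weights, features)
    return [w + lr * error * xi for w, xi in zip(weights, x)]


# Hypothetical training example: board features and the value the game outcome suggests.
weights = [0.0, 0.0, 0.0]
features, target = (1.0, 2.0), 1.0

error_before = abs(target - predict(weights, features))
weights = lms_update(weights, features, target)
error_after = abs(target - predict(weights, features))
print(weights, error_before, error_after)
```

Repeating this step over many positions from played games is how feedback from successes and failures gradually shapes the weights.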
  • 49. Concept of Hypothesis • The hypothesis is a common term in Machine Learning and data science projects. As we know, machine learning is one of the most powerful technologies across the world, which helps us to predict results based on past experiences. • Moreover, data scientists and ML professionals conduct experiments that aim to solve a problem. These ML professionals and data scientists make an initial assumption for the solution of the problem. • This assumption in Machine learning is known as Hypothesis. • In Machine Learning, at various times, Hypothesis and Model are used interchangeably. However, a Hypothesis is an assumption made by scientists, whereas a model is a mathematical representation that is used to test the hypothesis
  • 50. What is Hypothesis? • A hypothesis is defined as a supposition or proposed explanation based on insufficient evidence or assumptions. It is just a guess based on some known facts that has not yet been proven. A good hypothesis is testable, so that it turns out to be either true or false. • Example: Let's understand the hypothesis with a common example. Some scientists claim that ultraviolet (UV) light can damage the eyes, so it may also cause blindness. • In this example, the scientists only claim that UV rays are harmful to the eyes, but we assume they may cause blindness. However, this may or may not be true. Hence, these types of assumptions are called hypotheses.
  • 51. Two important types of hypotheses are as follows: • Null Hypothesis: A null hypothesis is a type of statistical hypothesis which states that no statistically significant effect exists in the given set of observations. It is also known as a conjecture and is used in quantitative analysis to test theories about markets, investment, and finance to decide whether an idea is true or false.
  • 52. Alternative Hypothesis: • An alternative hypothesis is a direct contradiction of the null hypothesis, which means that if one of the two hypotheses is true, then the other must be false. In other words, an alternative hypothesis is a type of statistical hypothesis which states that some significant effect exists in the given set of observations.
  • 53. Hypothesis in Machine Learning (ML) • The hypothesis is one of the commonly used concepts of statistics in Machine Learning. It is specifically used in Supervised Machine learning, where an ML model learns a function that best maps the input to corresponding outputs with the help of an available dataset.
  • 54. The following figure shows the common method to find out the possible hypothesis from the Hypothesis space:
  • 55. • In the methods used to find a possible hypothesis from the hypothesis space, the hypothesis space is denoted by uppercase H and a single hypothesis by lowercase h. These are defined as follows:
  • 56. Hypothesis space (H): • The hypothesis space is the set of all possible legal hypotheses; hence it is also known as the hypothesis set. Supervised machine learning algorithms search it for the hypothesis that best describes the target function, that is, the one that best maps input to output. • It is often constrained by the framing of the problem, the choice of model, and the choice of model configuration.
  • 57. Hypothesis (h): • It is defined as the approximate function that best describes the target in supervised machine learning algorithms. It is primarily based on data as well as bias and restrictions applied to data. • Hence hypothesis (h) can be concluded as a single hypothesis that maps input to proper output and can be evaluated as well as used to make predictions.
  • 58. • For a linear model, the hypothesis (h) can be formulated as follows: y = mx + b where • y: range (the predicted output) • m: slope of the line that divides the test data, i.e. the change in y divided by the change in x • x: domain (the input) • b: intercept (constant)
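As a rough illustration of the linear form above, the sketch below builds one hypothesis h(x) = m*x + b from a slope and an intercept. The helper names `make_hypothesis` and `slope` and the two sample points are hypothetical, chosen only for this example.

```python
def slope(x1, y1, x2, y2):
    """Slope m as the change in y divided by the change in x."""
    return (y2 - y1) / (x2 - x1)


def make_hypothesis(m, b):
    """Return one hypothesis h(x) = m*x + b for a fixed slope m and intercept b."""
    def h(x):
        return m * x + b
    return h


# Two illustrative points on the line: (0, 1) and (2, 5).
m = slope(0.0, 1.0, 2.0, 5.0)   # (5 - 1) / (2 - 0) = 2.0
h = make_hypothesis(m, b=1.0)   # intercept is the value of y at x = 0
prediction = h(3.0)             # 2.0 * 3.0 + 1.0
print(m, prediction)
```

Each choice of (m, b) gives a different hypothesis h; the set of all such lines is the hypothesis space H for this model.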
  • 59. Example: Let's understand the hypothesis (h) and hypothesis space (H) with a two-dimensional coordinate plane showing the distribution of data as follows:
  • 60. Now, assume we have some test data by which ML algorithms predict the outputs for input as follows:
  • 61. Suppose we divide this coordinate plane in such a way that it helps us predict the output, as follows:
  • 62. Based on the given test data, the output result will be as follows:
  • 63. However, depending on the data, algorithm, and constraints, this coordinate plane can also be divided in the following ways:
  • 64. From the above example, we can conclude that: • The hypothesis space (H) is the set of all legal ways to divide the coordinate plane so that inputs are mapped to their proper outputs. • Each individual way of dividing the plane is called a hypothesis (h). Hence, the hypothesis and hypothesis space would look like this:
  • 65.
  • 66. Version Spaces • A version space is a hierarchical representation of knowledge that enables you to keep track of all the useful information supplied by a sequence of learning examples without remembering any of the examples. • The version space method is a concept learning process accomplished by managing multiple models within a version space.
  • 67. • A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example x in D. • The version space VS with respect to hypothesis space H and training examples D is the subset of hypotheses from H that are consistent with all training examples in D.
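The two definitions above can be sketched by brute force: enumerate a small hypothesis space and keep only the consistent hypotheses. This sketch assumes conjunctive hypotheses over three attributes, with "?" meaning "any value"; that representation and the helper names are assumptions for illustration. The training data is the Example 2 table from these slides.

```python
from itertools import product


def matches(h, x):
    """h(x): True if hypothesis h classifies instance x as positive."""
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))


def consistent(h, D):
    """True if h(x) equals the given label c(x) for every example in D."""
    return all(matches(h, x) == label for x, label in D)


# Example 2 from the slides: (Size, Colour, Shape) -> class label.
D = [
    (("Big", "Red", "Circle"), False),
    (("Small", "Red", "Triangle"), False),
    (("Small", "Red", "Circle"), True),
    (("Big", "Blue", "Circle"), False),
    (("Small", "Blue", "Circle"), True),
]

# Hypothesis space H: every combination of a concrete value or "?" per attribute.
values = [("Big", "Small", "?"), ("Red", "Blue", "?"), ("Circle", "Triangle", "?")]
H = list(product(*values))

# Version space: the subset of H consistent with all examples in D.
VS = [h for h in H if consistent(h, D)]
print(len(H), VS)
```

Enumerating H is only practical for tiny spaces like this one (27 hypotheses); it is shown here to make the definition concrete, not as an efficient method.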
  • 68.
  • 69. Version Space Characteristics • A version space represents all the alternative plausible descriptions of a heuristic. • A plausible description is one that is applicable to all known positive examples and no known negative example.
  • 70. A version space description consists of two complementary trees: 1. One that contains nodes connected to overly general models, and 2. One that contains nodes connected to overly specific models.
  • 71. Diagrammatical Guidelines • There is a generalization tree and a specialization tree. • Each node is connected to a model. • Nodes in the generalization tree are connected to a model that matches everything in its subtree. • Nodes in the specialization tree are connected to a model that matches only one thing in its subtree.
  • 72. Links between nodes and their models denote • generalization relations in a generalization tree, and • specialization relations in a specialization tree.
  • 73. Diagram of a Version Space: the specialization tree is colored red, and the generalization tree is colored green.
  • 74. Generalization and Specialization Leads to Version Space Convergence • The key idea in version space learning is that specialization of the general models and generalization of the specific models may ultimately lead to just one correct model that matches all observed positive examples and does not match any negative examples.
  • 75. Version Space Method Learning Algorithm: Candidate-Elimination • The Candidate-Elimination algorithm computes the version space containing all hypotheses from H that are consistent with an observed sequence of training examples. • It begins by initializing the version space to the set of all hypotheses in H, that is, by initializing the G boundary set to contain the most general hypothesis in H • G0 ← {<?,?,?,?,?,?,?>} • and initializing the S boundary set to contain the most specific hypothesis • S0 ← {<0,0,0,0,0,0,0>} • Here ? means that any value of the attribute is acceptable, and 0 means that no value is acceptable.
  • 76. • These two boundary sets delimit the entire hypothesis space, because every other hypothesis in H is both more general than S0 and more specific than G0. • As each training example is considered, the S and G boundary sets are generalized and specialized, respectively, to eliminate from the version space any hypothesis found inconsistent with the new training example. • After all the examples have been processed, the computed version space contains all the hypotheses consistent with these examples, and only those hypotheses.
  • 77. The Candidate-Elimination algorithm goes as follows:
      1. Initialize G to the set of maximally general hypotheses in H.
      2. Initialize S to the set of maximally specific hypotheses in H.
      3. For each training example d:
         If d is a positive example:
           a. Remove from G any hypothesis that does not include d.
           b. For each hypothesis s in S that does not include d:
              - Remove s from S.
              - Add to S all minimal generalizations h of s such that h includes d and some member of G is more general than h.
           c. Remove from S any hypothesis that is more general than another hypothesis in S.
         If d is a negative example:
           a. Remove from S any hypothesis that includes d.
           b. For each hypothesis g in G that includes d:
              - Remove g from G.
              - Add to G all minimal specializations h of g such that h does not include d and some member of S is more specific than h.
           c. Remove from G any hypothesis that is less general than another hypothesis in G.
      4. If G or S ever becomes empty, the data are not consistent with H.
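The Candidate-Elimination steps can be sketched in Python as below. This is a minimal sketch, not a reference implementation: it assumes conjunctive hypotheses stored as tuples, with "?" for "any value" and `None` standing in for the "no value" symbol of the most specific hypothesis. The function names and the `domains` argument are hypothetical. The training data is the Example 2 table from these slides.

```python
def matches(h, x):
    """True if hypothesis h includes (covers) instance x."""
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))


def more_general_or_equal(h1, h2):
    """True if h1 covers every instance that h2 covers."""
    return all(a == "?" or a == b or b is None for a, b in zip(h1, h2))


def candidate_elimination(examples, domains):
    n = len(domains)
    S = {tuple([None] * n)}   # maximally specific boundary
    G = {tuple(["?"] * n)}    # maximally general boundary
    for x, positive in examples:
        if positive:
            G = {g for g in G if matches(g, x)}
            new_S = set()
            for s in S:
                if matches(s, x):
                    new_S.add(s)
                    continue
                # Minimal generalization of s that covers x (unique for conjunctions).
                h = tuple(xv if sv is None else (sv if sv == xv else "?")
                          for sv, xv in zip(s, x))
                if any(more_general_or_equal(g, h) for g in G):
                    new_S.add(h)
            # Drop members of S that are more general than another member.
            S = {s for s in new_S
                 if not any(s != t and more_general_or_equal(s, t) for t in new_S)}
        else:
            S = {s for s in S if not matches(s, x)}
            new_G = set()
            for g in G:
                if not matches(g, x):
                    new_G.add(g)
                    continue
                # Minimal specializations: replace one "?" by a value that excludes x.
                for i, gv in enumerate(g):
                    if gv != "?":
                        continue
                    for v in domains[i]:
                        if v != x[i]:
                            h = g[:i] + (v,) + g[i + 1:]
                            if any(more_general_or_equal(h, s) for s in S):
                                new_G.add(h)
            # Drop members of G that are less general than another member.
            G = {g for g in new_G
                 if not any(g != t and more_general_or_equal(t, g) for t in new_G)}
        if not S or not G:
            raise ValueError("training data is not consistent with H")
    return S, G


domains = [("Big", "Small"), ("Red", "Blue"), ("Circle", "Triangle")]
D = [
    (("Big", "Red", "Circle"), False),
    (("Small", "Red", "Triangle"), False),
    (("Small", "Red", "Circle"), True),
    (("Big", "Blue", "Circle"), False),
    (("Small", "Blue", "Circle"), True),
]
S, G = candidate_elimination(D, domains)
print(S, G)
```

On this data the two boundaries meet at the single hypothesis (Small, ?, Circle), which illustrates the convergence described earlier for version spaces.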
  • 78. Advantages of the version space method: • Can describe all the possible hypotheses in the language consistent with the data. • Fast (close to linear). Disadvantages of the version space method: • Inconsistent data (noise) may cause the target concept to be pruned. • Learning disjunctive concepts is challenging.
  • 79.
  • 80.
  • 81.
  • 82. Example 2
      Size  | Colour | Shape    | Class/label
      Big   | Red    | Circle   | No
      Small | Red    | Triangle | No
      Small | Red    | Circle   | Yes
      Big   | Blue   | Circle   | No
      Small | Blue   | Circle   | Yes
  • 83.
  • 84.
  • 85.
  • 86. Find-S Algorithm: finds the maximally specific hypothesis
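A minimal sketch of Find-S is given below. It assumes the usual formulation: start from the most specific hypothesis, ignore negative examples, and generalize just enough to cover each positive example. The tuple representation with "?" as a wildcard and the function name `find_s` are assumptions for illustration. The data is the Example 2 table from these slides.

```python
def find_s(examples):
    """Return the maximally specific conjunctive hypothesis covering all positives."""
    h = None  # None stands for the most specific hypothesis (covers nothing)
    for x, positive in examples:
        if not positive:
            continue  # Find-S ignores negative examples
        if h is None:
            h = list(x)  # first positive example: copy it exactly
        else:
            # Keep attribute values that agree, generalize the rest to "?".
            h = [hv if hv == xv else "?" for hv, xv in zip(h, x)]
    return tuple(h) if h is not None else None


D = [
    (("Big", "Red", "Circle"), False),
    (("Small", "Red", "Triangle"), False),
    (("Small", "Red", "Circle"), True),
    (("Big", "Blue", "Circle"), False),
    (("Small", "Blue", "Circle"), True),
]
h = find_s(D)
print(h)
```

Because negative examples are never consulted, Find-S cannot detect inconsistent data on its own; that is one reason Candidate-Elimination tracks the general boundary as well.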
  • 87.
  • 88.
  • 89.
  • 90.
  • 91.
  • 92.
  • 93. Performance Metrics Evaluating the performance of a Machine learning model is one of the important steps while building an effective ML model. To evaluate the performance or quality of the model, different metrics are used, and these metrics are known as performance metrics or evaluation metrics. These performance metrics help us understand how well our model has performed for the given data. In this way, we can improve the model's performance by tuning the hyper-parameters. Each ML model aims to generalize well on unseen/new data, and performance metrics help determine how well the model generalizes on the new dataset.
  • 94. • In machine learning, supervised tasks are broadly divided into classification and regression. Not all metrics can be used for all types of problems; hence, it is important to know and understand which metrics should be used. • Different evaluation metrics are used for regression and for classification tasks. In this topic, we will discuss metrics used for classification and regression tasks.
  • 95.
  • 96. Performance Metrics for Classification In a classification problem, the category or class of each data item is identified based on training data. The model learns from the given dataset and then classifies new data into classes or groups based on that training. It predicts class labels as the output, such as Yes or No, 0 or 1, Spam or Not Spam, etc. To evaluate the performance of a classification model, different metrics are used, and some of them are as follows:
  • 97. • Accuracy • Confusion Matrix • Precision • Recall • F-Score • AUC(Area Under the Curve)-ROC
  • 98. I. Accuracy The accuracy metric is one of the simplest classification metrics to implement. It is the ratio of the number of correct predictions to the total number of predictions. It can be formulated as: Accuracy = Number of correct predictions / Total number of predictions
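The accuracy ratio can be computed directly from two label lists. The sketch below is illustrative only; the function name and the small label lists are made up for this example.

```python
def accuracy(y_true, y_pred):
    """Number of correct predictions divided by the total number of predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels: 6 of the 8 predictions agree with the true labels.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc = accuracy(y_true, y_pred)
print(acc)
```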
  • 99. II. Confusion Matrix • A confusion matrix is a tabular representation of the prediction outcomes of a binary classifier. It describes the performance of the classification model on a set of test data for which the true values are known. • The confusion matrix is simple to implement, but the terminology used in it can be confusing for beginners. • A typical confusion matrix for a binary classifier looks like the image below (it can also be extended to classifiers with more than two classes).
  • 100.
  • 101. We can determine the following from the above matrix: • In the matrix, columns hold the predicted values and rows hold the actual values. Both actual and predicted values take one of two classes, Yes or No. So, if we are predicting the presence of a disease in a patient, a Yes in the prediction column means the patient is predicted to have the disease, and a No means the patient is predicted not to have it. • In this example, the total number of predictions is 165, of which 110 are Yes and 55 are No. • In reality, however, 60 patients do not have the disease, whereas 105 patients do.
  • 102. In general, the table is divided into four terms, which are as follows: 1. True Positive (TP): the prediction is positive, and the actual value is also positive. 2. True Negative (TN): the prediction is negative, and the actual value is also negative. 3. False Positive (FP): the prediction is positive, but the actual value is negative. 4. False Negative (FN): the prediction is negative, but the actual value is positive.
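The four cells can be counted directly from actual and predicted labels, as sketched below. The individual cell counts used here (TP = 100, TN = 50, FP = 10, FN = 5) are an assumption: the slides state only the totals (165 predictions, 110 predicted Yes, 55 predicted No, 105 actual Yes, 60 actual No), and these counts were chosen to agree with those totals. The function name is hypothetical.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="Yes"):
    """Count TP, TN, FP and FN for a binary classifier."""
    counts = Counter()
    for t, p in zip(y_true, y_pred):
        if p == positive:
            counts["TP" if t == positive else "FP"] += 1
        else:
            counts["FN" if t == positive else "TN"] += 1
    return {k: counts[k] for k in ("TP", "TN", "FP", "FN")}


# Assumed cell counts, consistent with the totals given for the disease example.
y_true = ["Yes"] * 100 + ["No"] * 50 + ["No"] * 10 + ["Yes"] * 5
y_pred = ["Yes"] * 100 + ["No"] * 50 + ["Yes"] * 10 + ["No"] * 5

cm = confusion_counts(y_true, y_pred)
print(cm)
print("predicted Yes:", cm["TP"] + cm["FP"], "actual Yes:", cm["TP"] + cm["FN"])
```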
  • 103. III. Precision The precision metric is used to overcome a limitation of accuracy. Precision is the proportion of positive predictions that were actually correct. It is calculated as the ratio of true positives to all positive predictions (true positives plus false positives): Precision = TP / (TP + FP)
  • 104. IV. Recall or Sensitivity Recall is similar to precision; however, it measures the proportion of actual positives that were identified correctly. It is calculated as the ratio of true positives to the total number of actual positives, whether correctly predicted as positive or incorrectly predicted as negative (true positives plus false negatives). The formula for calculating recall is given below: Recall = TP / (TP + FN)
  • 105. • Specificity Specificity, in contrast to recall, is the proportion of actual negatives that the model correctly identifies as negative. It can be calculated from the confusion matrix with the following formula: Specificity = TN / (TN + FP)
  • 106. V. F-Scores • The F-score or F1 score is a metric that evaluates a binary classification model on the basis of the predictions made for the positive class. It is calculated from precision and recall and combines both into a single score. The F1 score is the harmonic mean of precision and recall, assigning equal weight to each of them. The formula for calculating the F1 score is given below: F1 = 2 × (Precision × Recall) / (Precision + Recall)
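Precision, recall, specificity and the F1 score can all be derived from the four confusion-matrix counts, as in the sketch below. The counts passed in are hypothetical (they match the totals of the disease example but are not given in the slides), and the function assumes every denominator is non-zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, specificity and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)        # correct share of positive predictions
    recall = tp / (tp + fn)           # share of actual positives that were found
    specificity = tn / (tn + fp)      # share of actual negatives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration.
metrics = classification_metrics(tp=100, tn=50, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Since the harmonic mean is pulled towards the smaller of its two inputs, a model cannot reach a high F1 score by maximizing precision alone or recall alone.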
  • 107. • VI. AUC-ROC Sometimes we need to visualize the performance of the classification model on charts; then, we can use the AUC-ROC curve. It is one of the popular and important metrics for evaluating the performance of the classification model.
  • 108. • First, let's understand the ROC (Receiver Operating Characteristic) curve. The ROC curve is a graph that shows the performance of a classification model at different threshold levels. The curve is plotted between two parameters, which are: • True Positive Rate • False Positive Rate
  • 109. TPR or True Positive Rate is a synonym for recall, hence it can be calculated as: TPR = TP / (TP + FN) FPR or False Positive Rate can be calculated as: FPR = FP / (FP + TN)
  • 110. To calculate the value at any point on a ROC curve, we could evaluate a model such as logistic regression many times with different classification thresholds, but this would not be very efficient. Instead, a more efficient summary measure is used, which is known as AUC.
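The threshold sweep described above can be sketched as follows: every distinct predicted score is tried as a threshold, and the resulting (FPR, TPR) pair is recorded as one point of the ROC curve. The scores and labels are hypothetical, and the function assumes the data contains at least one positive and one negative example.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest."""
    pos = sum(y_true)               # number of actual positives (labels are 0/1)
    neg = len(y_true) - pos         # number of actual negatives
    points = [(0.0, 0.0)]           # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


# Hypothetical model scores (e.g. predicted probabilities) and true labels.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

This version rescans the data for every threshold, which is exactly the inefficiency mentioned above; it is kept that way here for clarity.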
  • 111. AUC: Area Under the ROC curve • AUC stands for Area Under the ROC Curve. As its name suggests, AUC measures the two-dimensional area under the entire ROC curve, as shown in the image below:
  • 112.
  • 113. • AUC calculates the performance across all the thresholds and provides an aggregate measure. The value of AUC ranges from 0 to 1. It means a model with 100% wrong prediction will have an AUC of 0.0, whereas models with 100% correct predictions will have an AUC of 1.0.
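One way to see why AUC runs from 0 to 1 is to compute it as a ranking statistic: the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as half. This is equivalent to the area under the ROC curve, but it is a different formulation from the area computation described in the slides. The function name and data below are hypothetical.

```python
def auc_pairwise(y_true, scores):
    """AUC as the share of positive/negative pairs that are ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0     # positive ranked above negative
            elif p == n:
                wins += 0.5     # tie counts as half
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
perfect = auc_pairwise(labels, [0.1, 0.2, 0.8, 0.9])    # every positive outranks every negative
inverted = auc_pairwise(labels, [0.9, 0.8, 0.2, 0.1])   # every prediction ranked the wrong way
mixed = auc_pairwise([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
print(perfect, inverted, mixed)
```

The pairwise loop is quadratic in the number of examples, so this sketch is meant for small illustrations rather than large datasets.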