MACHINE LEARNING
What is MACHINE LEARNING?
 There is no single well-defined definition, but two classic ones are widely quoted.
 Arthur Samuel (1959):
Machine learning: "Field of study that gives computers the ability to learn
without being explicitly programmed"
 Samuel wrote a checkers playing program
 Had the program play 10000 games against itself
 Work out which board positions were good and bad depending on
wins/losses
Example
 Tom Mitchell (1997):
Well posed learning problem: "A computer program is said to learn from
experience E with respect to some class of tasks T and performance measure P, if
its performance at tasks in T, as measured by P, improves with experience E."
Another definition…
The checkers example,
 E = the experience of playing 10,000 games
 T = the task of playing checkers
 P = whether the program wins or loses
How Machine Learning Works
Is Machine Learning Magic?
No,
It is more like gardening.
Seeds = Algorithms
Nutrients = Data
Gardener = You
Plants = Programs
Types of MACHINE LEARNING
Supervised Learning (Train me)
 It is a data mining task of inferring a function from labeled training data.
 The training data consist of a set of training examples.
 In supervised learning, each example is a pair consisting of an input object (typically a
vector) and the desired output value (also called the supervisory signal)
Unsupervised Learning (I am self-sufficient in learning)
 It learns from data that has not been labeled, classified or categorized.
 Instead of responding to feedback, unsupervised learning identifies commonalities in
the data and reacts based on the presence or absence of such commonalities in each new
piece of data
Reinforcement Learning (My life, my rules! (Hit & Trial))
 It is the ability of an agent to interact with the environment and find out what the best outcome is. It follows the concept of the hit-and-trial (trial-and-error) method.
 The agent is rewarded with a point for a correct answer and penalized for a wrong one, and on the basis of the positive reward points gained, the model trains itself.
 Reinforcement learning differs from supervised learning in that, in supervised learning, the training data comes with the answer key, so the model is trained on the correct answers, whereas in reinforcement learning there is no answer key; the reinforcement agent decides what to do to perform the given task.
 In the absence of a training dataset, it is bound to learn from its experience.
Supervised Learning
• Labeled data
• Direct feedback
• Predict outcome/future
Unsupervised Learning
• No labels
• No feedback
• Find hidden structure
Reinforcement Learning
• Decision process
• Reward system
• Learn series of actions
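To make the contrast concrete, here is a minimal, hedged sketch in Python (assuming scikit-learn is available; the toy arrays are invented for illustration): a supervised classifier fits on labeled data, while a clustering algorithm finds structure in the same points without labels.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

# Toy 2-D points (invented for illustration)
X = np.array([[1, 1], [1, 2], [2, 1],     # one group near the origin
              [8, 8], [8, 9], [9, 8]])    # one group far away
y = np.array([0, 0, 0, 1, 1, 1])          # labels, seen only by the supervised model

# Supervised: learn from (X, y), then predict a label for a new point
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([[2, 2]]))              # -> [0]

# Unsupervised: no labels given; K-Means discovers the two groups on its own
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)                         # group assignments found from X alone
```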
Supervised Learning vs. Unsupervised Learning
Real-life example
The task is to arrange the fruits below into groups.
No | Size  | Colour | Shape                                       | Fruit Name
1  | Big   | Red    | Rounded shape with a depression at the top  | Apple
2  | Small | Red    | Heart-shaped to nearly globular             | Cherry
3  | Big   | Green  | Long curving cylinder                       | Banana
4  | Small | Green  | Round to oval, bunch shape cylindrical      | Grape
For Supervised Learning
 You have already learned the physical characteristics of fruits from your previous work.
 So you arrange fruits of the same type together.
 Your previous work is called training data in data mining.
 You have already learned from your training data; this is because of the response variable.
 A response variable is simply the decision (label) variable.
For Unsupervised Learning
 This time we don't know anything about the fruits; honestly, this is the first time you have seen them, and you have no clue about them.
 So, how will we arrange them?
 What will we do first?
 We will take a fruit and arrange the fruits by considering the physical characteristics of that particular fruit.
 Suppose we first consider color:
• RED COLOR GROUP: apples & cherries.
• GREEN COLOR GROUP: bananas & grapes.
 Now consider size along with the previous grouping:
• RED COLOR AND BIG SIZE: apples.
• RED COLOR AND SMALL SIZE: cherries.
• GREEN COLOR AND BIG SIZE: bananas.
• GREEN COLOR AND SMALL SIZE: grapes.
 This type of learning is known as unsupervised learning.
 Clustering comes under unsupervised learning.
Machine Learning Techniques
MACHINE LEARNING
SUPERVISED LEARNING
 CLASSIFICATION: Support Vector Machines, Discriminant Analysis, Naïve Bayes, Nearest Neighbor, Decision Trees, Neural Networks
 REGRESSION: Linear Regression, GLM, SVR, GPR, Ensemble Methods
UNSUPERVISED LEARNING
 CLUSTERING: K-Means, K-Medoids, Fuzzy C-Means, Hierarchical, Gaussian Mixture, Hidden Markov Model, Neural Networks
Selecting the Right Algorithm
 Selecting a machine learning algorithm is a process of trial and error.
 It’s also a trade-off between specific characteristics of the algorithms,
such as:
 Speed of training
 Memory usage
 Predictive accuracy on new data
 Transparency or interpretability (how easily you can
understand the reasons an algorithm makes its predictions)
SUPERVISED LEARNING
Classification
Classification techniques predict discrete responses—for example, whether an email is genuine or spam, or whether a tumor is small, medium, or large. Classification models are trained to classify data into categories. Applications include medical imaging, speech recognition, and credit scoring. If the data can be separated into specific groups or classes, use classification algorithms.

Regression
Regression techniques predict continuous responses—for example, changes in temperature or fluctuations in electricity demand. Applications include forecasting stock prices, handwriting recognition, and acoustic signal processing. If the nature of your response is a real number, such as temperature or the time until failure for a piece of equipment, use regression techniques.
 Let’s take a closer look at the most commonly used
classification and regression algorithms.
Binary vs. Multiclass Classification
When working on a classification problem, begin by determining whether the problem is binary or multiclass.
Binary Classification: a single training or test item (instance) can only be divided into two classes—for example, determining whether an email is genuine or spam.
Multiclass Classification: an item can be divided into more than two classes—for example, training a model to classify an image as a dog, cat, or other animal. It requires a more complex model.
Binary vs. Multiclass Classification
[Figure: two scatter plots in the X1–X2 plane. Left (binary): two classes, marked × and ∆, separated into two groups. Right (multiclass): more than two classes.]
Other examples for Classification
 Binary Classification
 Put a tennis ball into the colored or non-colored bin (by color)
 (Medical Test) Determine if a patient has certain disease or not
 (Quality Control Test) Decide if a product should be sold or discarded
 (IR Test) Determine if a document should be in the search results or not
 Multi-Class Classification
 Put a tennis ball into the Green, Orange, or White ball bin (color)
 Decide if an email is advertisement, newsletter, phishing, hack, or
personal.
 Classify a document into Yahoo! Categories
 (Optical Recognition) Classify a scanned character into digit (0..9)
Support Vector Machine
 "Support Vector Machine" (SVM) is a supervised machine learning algorithm that is used mostly for classification problems.
 In this algorithm, each data item is plotted as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate.
 Then classification is performed by finding the hyper-plane that differentiates the two classes well.
[Figure: two classes separated by a hyper-plane, with the margin marked on either side.]
How Support Vector Machine Works
Classifies data by finding the linear decision boundary
(hyperplane) that separates all data points of one class
from those of the other class.
The best hyperplane for an SVM is the one with the
largest margin between the two classes, when the data is
linearly separable.
If the data is not linearly separable, a loss function is used
to penalize points on the wrong side of the hyperplane.
 SVMs sometimes use a kernel transform to transform nonlinearly separable data into
higher dimensions where a linear decision boundary can be found.
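As a hedged illustration (a minimal sketch, assuming scikit-learn is available; the toy data are invented), the following fits a linear SVM and reads back the separating hyperplane and support vectors:

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable toy classes (invented for illustration)
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
              [6.0, 6.0], [6.5, 7.0], [7.0, 6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel with a large C approximates a hard-margin SVM
clf = SVC(kernel="linear", C=1e6).fit(X, y)

print(clf.coef_, clf.intercept_)   # w and b of the hyperplane w.x + b = 0
print(clf.support_vectors_)        # the points that define the margin
print(clf.predict([[2.0, 2.0]]))   # -> [0]
```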
Identify the right hyper-plane
Which is the right hyper-plane?
 "Select the hyper-plane which segregates the two classes best."
 In this scenario, hyper-plane "B" has performed this job excellently.
Which is the right hyper-plane?
 Here, the margin for hyper-plane C is higher than for both A and B.
 Hence, we name C as the right hyper-plane.
Identify the right hyper-plane
Which is the right hyper-plane?
 SVM selects the hyper-plane which classifies the classes accurately before maximizing the margin.
 Here, hyper-plane B has a classification error and A has classified all points correctly.
 Therefore, the right hyper-plane is A.
Which is the right hyper-plane?
 SVM solves this problem by introducing an additional feature. Here, we add a new feature z = x² + y² (a kernel transformation).
 Now, let's plot the data points on the x and z axes:
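A minimal numpy sketch of this idea (the data are invented for illustration): points inside a circle are not linearly separable in (x, y), but become separable by a single threshold on the new feature z = x² + y².

```python
import numpy as np

rng = np.random.default_rng(0)

# Class 0: points near the origin; class 1: a ring around them (invented data)
r0 = rng.uniform(0.0, 1.0, 50)
r1 = rng.uniform(2.0, 3.0, 50)
theta = rng.uniform(0, 2 * np.pi, 50)
inner = np.c_[r0 * np.cos(theta), r0 * np.sin(theta)]
outer = np.c_[r1 * np.cos(theta), r1 * np.sin(theta)]

# Kernel-style feature map: z = x^2 + y^2 (squared distance from the origin)
z_inner = (inner ** 2).sum(axis=1)
z_outer = (outer ** 2).sum(axis=1)

# In z, one threshold separates the classes perfectly
print(z_inner.max() < z_outer.min())   # True: linearly separable in z
```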
Support Vector Machine: best used…
For data that has exactly two classes (you can also use it for multiclass classification with a technique called error-correcting output codes)
For high-dimensional, nonlinearly separable data
Pros and Cons associated with SVM
Pros:
 It works really well with a clear margin of separation.
 It is effective in high-dimensional spaces.
 It is effective in cases where the number of dimensions is greater than the number of samples.
 It uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
Cons:
 It doesn't perform well when we have a large data set, because the required training time is higher.
 It also doesn't perform very well when the data set has more noise, i.e. the target classes are overlapping.
 SVM doesn't directly provide probability estimates; these are calculated using an expensive five-fold cross-validation.
Discriminant Analysis
Discriminant analysis (DA) is a technique for analyzing data when the criterion or dependent variable is categorical and the predictor or independent variables are interval in nature.
 It is a technique to discriminate between two or more mutually exclusive and exhaustive groups on the basis of some explanatory variables.
Types of Discriminant Analysis (DA)
1. Linear DA: when the criterion / dependent variable has two categories. Example: adopters & non-adopters.
2. Multiple DA: when three or more categories are involved. Example: SHG1, SHG2, SHG3.
How DA Works
Assumptions
1. Sample Size (n): Group sizes of the dependent variable should not be grossly different (e.g., 80:20), and the sample should be at least five times the number of independent variables.
2. Normal Distribution: Each of the independent variables is normally distributed.
3. Homogeneity of variances / covariances: All variables have linear and homoscedastic relationships.
4. Outliers: Outliers should not be present in the data; DA is highly sensitive to the inclusion of outliers.
5. Non-multicollinearity: There should be no multicollinearity among the independent variables.
6. Mutually exclusive: The groups must be mutually exclusive, with every subject or case belonging to only one group.
7. Classification: Each of the allocations for the dependent categories in the initial classification is correctly classified.
Discriminant Analysis Model
 The discriminant analysis model involves linear combinations of the following
form
𝐷 = 𝑏0 + 𝑏1𝑋1 + 𝑏2𝑋2 + 𝑏3𝑋3 + … + 𝑏𝑘𝑋𝑘
 where
 D = discriminant score
 b 's = discriminant coefficient or weight
 X 's = predictor or independent variable
 The coefficients, or weights (b), are estimated so that the groups differ as much
as possible on the values of the discriminant function.
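As a hedged sketch (assuming scikit-learn; the two-group toy data are invented), linear discriminant analysis can be fit and its learned weights b inspected:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two groups measured on two interval-scale predictors (invented data)
X = np.array([[2.0, 3.0], [2.5, 3.5], [3.0, 3.0],    # e.g. non-adopters
              [6.0, 7.0], [6.5, 6.0], [7.0, 7.5]])   # e.g. adopters
y = np.array([0, 0, 0, 1, 1, 1])

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.coef_, lda.intercept_)      # discriminant weights b1..bk and b0
print(lda.predict([[4.0, 4.0]]))      # assign a new case to a group
```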
Applications of the Discriminant Analysis Model
 Discriminant analysis has been successfully used for many applications, as long as we can transform the problem into a classification problem.
 DA can be used for original applications also.
1. Identification
 To identify the type of customer that is likely to buy a certain product in a store.
 Using a simple questionnaire survey, we can get the features of customers.
 DA will help to select which features can describe the group membership of buying or not buying the product.
2. Decision Making
 A doctor diagnosing illness may be seen as deciding which disease the patient has.
 This problem can be transformed into a classification problem by assigning the patient to one of a number of possible disease groups based on observation of the symptoms.
3. Prediction
 The question "will it rain today?" can be thought of as prediction.
 The prediction problem can be thought of as assigning "today" to one of the two possible groups, rain and dry.
4. Pattern Recognition
 Distinguishing pedestrians from dogs and cars in a captured image sequence of traffic data is a classification problem.
5. Learning
 Scientists teaching a robot to learn to talk can be seen as a classification problem.
 It assigns frequency, pitch, tone, and many other measurements of sound into many groups of words.
Naïve Bayes Model
 It is a classification technique based on Bayes' theorem with an assumption of independence among predictors.
 It is easy to build and particularly useful for very large datasets.
 It learns and predicts very fast, and it does not require lots of storage.
 It has an assumption: all features must be independent of each other.
 It still returns very good accuracy in practice even when the independence assumption does not hold.
1. Real-time Prediction
Applications of Naïve Bayes Model
2. Multi - Class Prediction
3. Text Classification/ Spam Filtering/Sentiment Analysis
4. Recommendation System
Probability Basics
• Prior, conditional and joint probability for random variables
– Prior probability: P(x)
– Conditional probability: P(x1|x2), P(x2|x1)
– Relationship: P(x1, x2) = P(x2|x1) P(x1) = P(x1|x2) P(x2)
– Independence: P(x2|x1) = P(x2), P(x1|x2) = P(x1), P(x1, x2) = P(x1) P(x2)
• Bayesian Rule: P(c|x) = P(x|c) P(c) / P(x), i.e. posterior = likelihood × prior / evidence; the discriminative quantity P(c|x) is computed from the generative quantities P(x|c) and P(c).
Example to Understand Bayes' Theorem
An experiment involves 2 boxes. Box 1 contains 2 white balls and 3 red balls; Box 2 contains 4 white balls and 5 red balls. One ball is drawn at random from one of the boxes and is found to be red. Find the probability that it was drawn from the second box.
Let us assume: red ball = R, white ball = W, Box 1 = A, Box 2 = B.
Solution
Probability of selecting Box 1: P(A) = 1/2
Probability of selecting Box 2: P(B) = 1/2
Probability of getting a red ball from Box 1: P(R|A) = 3/5
Probability of getting a red ball from Box 2: P(R|B) = 5/9
Probability that the red ball was drawn from the second box:
P(B|R) = P(R|B)·P(B) / [P(R|B)·P(B) + P(R|A)·P(A)]
This is Bayes' theorem.
P(B|R) = (5/9 × 1/2) / (5/9 × 1/2 + 3/5 × 1/2) = (5/18) / (5/18 + 3/10) = 25/52 ≈ 0.481 = 48.1%
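A quick sketch using only Python's standard library (no assumptions beyond the numbers above) confirms the value exactly:

```python
from fractions import Fraction

p_a, p_b = Fraction(1, 2), Fraction(1, 2)     # prior: either box equally likely
p_r_a = Fraction(3, 5)                        # P(red | box 1): 3 red of 5 balls
p_r_b = Fraction(5, 9)                        # P(red | box 2): 5 red of 9 balls

# Bayes' theorem: P(box 2 | red)
p_b_r = (p_r_b * p_b) / (p_r_b * p_b + p_r_a * p_a)
print(p_b_r, float(p_b_r))                    # 25/52 ≈ 0.4808
```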
Example to Understand Bayes' Theorem
With the tabulation of 100 people below, what is the conditional probability that a certain member of the school is a 'Teacher' given that he is a 'Man'?

          Female   Male   Total
Teacher      8      12      20
Student     32      48      80
Total       40      60     100
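Working it out, one step of conditional probability from the table gives P(Teacher | Man) = 12 / 60 = 0.20. Equivalently, by Bayes' rule: P(Teacher | Man) = P(Man | Teacher) · P(Teacher) / P(Man) = (12/20 × 20/100) / (60/100) = 0.12 / 0.60 = 0.20.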
The Naïve Bayes Model
 The Bayes Rule provides the formula for the probability of Y given X. But in real-world problems, you typically have multiple X variables.
 When the features are independent, we can extend the Bayes Rule to what is called Naive Bayes.
 It is called 'naive' because of the naive assumption that the X's are independent of each other.
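Concretely (stating the standard form that the worked example below relies on, with the same per-feature evidence terms used in its Step 2):
P(Y | x1, x2, …, xn) = [ P(x1|Y) · P(x2|Y) · … · P(xn|Y) · P(Y) ] / [ P(x1) · P(x2) · … · P(xn) ]
Since the denominator is the same for every class, it can be dropped when we only need the most probable class.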
Naive Bayes Example
 Say you have 1000 fruits which could be either ‘banana’, ‘orange’ or ‘other’. These
are the 3 possible classes of the Y variable.
For the sake of computing the probabilities, let’s aggregate the
training data to form a counts table like this.
Step 1: Compute the 'prior' probabilities for each class of fruit.
P(Y=Banana) = 500 / 1000 = 0.50
P(Y=Orange) = 300 / 1000 = 0.30
P(Y=Other) = 200 / 1000 = 0.20.
Step 2: Compute the probability of evidence that goes in the denominator:
P(x1=Long) = 500 / 1000 = 0.50
P(x2=Sweet) = 650 / 1000 = 0.65
P(x3=Yellow) = 800 / 1000 = 0.80
Step 3: Compute the probability (likelihood) of the evidence that goes in the numerator:
P(x1=Long | Y=Banana) = 400 / 500 = 0.80
P(x2=Sweet | Y=Banana) = 350 / 500 = 0.70
P(x3=Yellow | Y=Banana) = 450 / 500 = 0.90
So, the overall probability of Likelihood of evidence for Banana =
0.8 * 0.7 * 0.9 = 0.504
Step 4: Substitute all three pieces into the Naive Bayes formula to get the probability that it is a banana.
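Carrying out Step 4 with the numbers above: P(Banana | Long, Sweet, Yellow) = 0.504 × 0.50 / (0.50 × 0.65 × 0.80) = 0.252 / 0.26 ≈ 0.969, so the fruit is classified as a banana. The corresponding numerators for Orange and Other would be computed the same way and compared.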
Nearest Neighbor Algorithm
 Simple analogy: tell me about your friends (who your neighbors are), and I will tell you who you are.
Other Names for Nearest neighbor Algorithm
 K-Nearest Neighbors
 Memory-Based Reasoning
 Example-Based Reasoning
 Instance-Based Learning
 Lazy Learning
What is KNN (K-Nearest Neighbor)
 A powerful classification algorithm used in pattern recognition.
 K-nearest neighbors stores all available cases and classifies new cases based on a similarity measure (e.g., a distance function).
 One of the top data mining algorithms used today.
 A non-parametric lazy learning algorithm (an instance-based learning method).
When do we use KNN
 KNN can be used for both classification and regression predictive problems.
However, it is more widely used in classification problems in the industry.
 To evaluate any technique, we generally look at three important aspects: ease of interpreting the output, calculation time, and predictive power.
 KNN is commonly used for its ease of interpretation and low calculation time.
How does KNN work?
KNN has the following basic steps:
1. Calculate distance
2. Find closest neighbors
3. Vote for labels
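These three steps can be sketched directly with scikit-learn (a hedged, minimal example; the toy data are invented):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Labeled training cases (invented): two features per case
X_train = np.array([[1, 1], [2, 1], [1, 2],
                    [7, 7], [8, 7], [7, 8]])
y_train = np.array([0, 0, 0, 1, 1, 1])

# k = 3: the 3 closest neighbors (Euclidean distance by default) vote
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

print(knn.predict([[6, 6]]))         # majority label of the 3 nearest -> [1]
print(knn.kneighbors([[6, 6]]))      # distances and indices of those neighbors
```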
Effect of K in KNN
How to choose the factor K
 Look at the training error rate and the validation error rate.
 Segregate training and validation sets from the initial dataset, then
 Plot the validation error curve to get the optimal value of K. This value of K should be used for all predictions.
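A hedged sketch of that procedure (assuming scikit-learn; the synthetic data stand in for a real dataset): evaluate several values of K on a held-out validation set and keep the best one.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data standing in for a real dataset
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Validation error for a range of K values; pick the K with the lowest error
errors = {k: 1 - KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr).score(X_val, y_val)
          for k in range(1, 21)}
best_k = min(errors, key=errors.get)
print(best_k, errors[best_k])
```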
Distance Measure for Continuous Variables
 Minkowski: D(x, y) = (Σᵢ |xᵢ − yᵢ|ᵖ)^(1/p); p = 1 gives the Manhattan distance and p = 2 the Euclidean distance.
Example: distance measures from John to the others using the Euclidean distance
George to John: √[(35 − 37)² + (35 − 50)² + (3 − 2)²] = 15.16
Rachel to John: √[(22 − 37)² + (50 − 50)² + (2 − 2)²] = 15
Steve to John: √[(63 − 37)² + (200 − 50)² + (1 − 2)²] = 152.23
Tom to John: √[(59 − 37)² + (170 − 50)² + (1 − 2)²] = 122
Tom to John: √[(25 − 37)² + (40 − 50)² + (4 − 2)²] = 15.74
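The same distances can be checked in a few lines of numpy (a sketch; the three feature values per person are read off the expressions above, so they are inferred rather than given directly):

```python
import numpy as np

john = np.array([37, 50, 2])
others = {"George": [35, 35, 3], "Rachel": [22, 50, 2],
          "Steve": [63, 200, 1], "Tom": [59, 170, 1]}

for name, vec in others.items():
    d = np.linalg.norm(np.array(vec) - john)   # Euclidean distance
    print(f"{name} to John: {d:.2f}")          # rounding may differ slightly
                                               # from the slide's truncated values
```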
Linear Regression
In a regression problem, the goal of the algorithm is to predict a real-valued output.
Types of Regression
1. Simple Linear Regression
2. Polynomial Regression
3. Support Vector Regression
4. Decision Tree Regression
5. Random Forest Regression
Form of Linear Regression
𝑌 = 𝑏0 + 𝑏1𝑋1 + 𝑏2𝑋2 + 𝑏3𝑋3 + … + 𝑏𝑘𝑋𝑘
 Y is the response
 b values are called the model coefficients. These values are “learned”
during the model fitting/training step.
 𝑏0 is the intercept
 𝑏1 is the coefficient for X1 (the first feature)
 𝑏𝑘 is the coefficient for Xk (the kth feature)
Steps for Training Linear Regression
1. Model Coefficients/Parameters
When training a linear regression model, we are trying to find the coefficients of the linear function that best describes the relationship between the input variables and the response.
2. Cost Function (Loss Function)
When building a linear model, we try to minimize the error the algorithm makes in its predictions; we do that by choosing a function that measures the error, called the cost function.
3. Estimate the Coefficients
For that task there is a mathematical algorithm called gradient descent.
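A minimal, hedged sketch of gradient descent for simple linear regression (the data are invented, roughly following y = 2x + 1; batch updates on a mean-squared-error cost):

```python
import numpy as np

# Invented data roughly following y = 2x + 1
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

b0, b1, lr = 0.0, 0.0, 0.01          # intercept, slope, learning rate
for _ in range(5000):
    y_hat = b0 + b1 * X
    err = y_hat - y
    # Gradients of the mean squared error cost w.r.t. b0 and b1
    b0 -= lr * 2 * err.mean()
    b1 -= lr * 2 * (err * X).mean()

print(b0, b1)                        # close to the true intercept 1 and slope 2
```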
Model evaluation metrics for regression
 It is necessary to use evaluation metrics designed for comparing continuous values.
 Root Mean Squared Error (RMSE) is one of the best evaluation methods:
RMSE = √[ (1/n) Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² ]
where yᵢ is the observed value and ŷᵢ is the model's prediction for example i.
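A quick sketch (numpy only; the arrays are invented) of computing RMSE from observed values and predictions:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.5, 10.0])   # observed values (invented)
y_pred = np.array([2.8, 5.3, 7.0, 10.4])   # model predictions (invented)

rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
print(rmse)
```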
Decision Tree Algorithm
Example
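The decision-tree slides themselves are figures; as a hedged stand-in (assuming scikit-learn; the tiny fruit dataset below is invented), here is a minimal tree whose learned rules can be printed:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Invented data: [size (cm), is_red (0/1)] -> 0 = cherry, 1 = apple
X = np.array([[2, 1], [2.5, 1], [3, 1], [8, 1], [9, 1], [8.5, 1]])
y = np.array([0, 0, 0, 1, 1, 1])

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["size_cm", "is_red"]))  # learned splits
print(tree.predict([[7.5, 1]]))                                # -> [1] (apple)
```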
Learn More About machine learning through Online Courses
1. Coursera – Machine Learning – Andrew Ng – Stanford University
2. Machine Learning for Intelligent Systems – Kilian Weinberger