This document discusses machine learning algorithms for classification problems, specifically logistic regression. It explains that logistic regression predicts the probability of a binary outcome using a sigmoid function. Unlike linear regression, which is used for continuous outputs, logistic regression is used for classification problems where the output is discrete/categorical. It describes how logistic regression learns model parameters through gradient descent optimization of a likelihood function to minimize error. Regularization techniques can also be used to address overfitting issues that may arise from limited training data.
2. Linear regression
• Our goal is to estimate w from training data of <x_i, y_i> pairs
• One way to find such a relationship is to minimize the least squares error:
$\hat{w} = \arg\min_w \sum_i (y_i - w x_i)^2$
[Figure: Y plotted against X, with the model y = wx + ε]
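As a minimal sketch of the least squares estimate above: for the one-parameter model y = wx, setting the derivative of the squared error to zero gives w = Σ x_i y_i / Σ x_i². The function name and the training pairs below are hypothetical and chosen only for illustration.

```python
def fit_slope_through_origin(xs, ys):
    """Return the w that minimizes sum((y_i - w * x_i) ** 2)."""
    # Closed-form solution of d/dw sum((y_i - w * x_i) ** 2) = 0
    numerator = sum(x * y for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs)
    return numerator / denominator


# Hypothetical training pairs that lie exactly on the line y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = fit_slope_through_origin(xs, ys)
print(w)
```

With noisy data the same formula returns the slope with the smallest total squared error rather than an exact fit.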
4. Marks of a student based on the number of
hours he/she put into the preparation
• Simple linear regression…..
• Multiple linear regression….
• Non-linear problem….
5. Marks of a student based on the number of
hours he/she put into the preparation
Let's assume the marks of a student (M) depend on the number
of hours (H) he/she puts into preparation.
The following formula can represent the model:
Marks = function(No. of hours)
=> Marks = m*Hours + c
7. Marks of a student based on the number of
hours he/she put into the preparation
Let's plot the data to check whether it's a linear problem
– The easiest way to determine
8. Marks of a student based on the number of
hours he/she put into the preparation
9. Marks of a student based on the number of
hours he/she put into the preparation
10. Marks of a student based on the number of
hours he/she put into the preparation
How to determine the slope of the line
The value of m….
11. Marks of a student based on the number of
hours he/she put into the preparation
• The value of m (slope of the line) can be
determined using an objective function, which
in general combines a loss function and a
regularization term.
• For simple linear regression, the objective
function is the Mean Squared Error (MSE): the
mean of the squared differences between the
actual and the predicted marks.
• The best-fit line is obtained by minimizing
this objective function.
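The idea of minimizing the mean squared error can be sketched as follows. For a single input, the minimizing slope m and intercept c have a well-known closed form, so no iterative search is needed here. The hours/marks values and the function names are hypothetical, and no regularization term is used.

```python
def mean_squared_error(m, c, hours, marks):
    """Mean of the squared differences between actual and predicted marks."""
    n = len(hours)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(hours, marks)) / n


def fit_line(hours, marks):
    """Return the (m, c) that minimizes the MSE of Marks = m*Hours + c."""
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(marks) / n
    # Slope = covariance(x, y) / variance(x); the intercept follows from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, marks))
    sxx = sum((x - mean_x) ** 2 for x in hours)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


# Hypothetical data: hours of preparation and marks obtained
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
marks = [35.0, 45.0, 50.0, 65.0, 70.0]
m, c = fit_line(hours, marks)
best_mse = mean_squared_error(m, c, hours, marks)
print(m, c, best_mse)
```

Any other choice of m or c gives a larger MSE on the same data, which is what "best fit" means here.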
13. Predicting weight reduction in form of the
number of KGs reduced
• Let's assume
it could depend upon input features such as
age, height, the weight of the person, and the
time spent on exercise.
14. Predicting weight reduction in form of the
number of KGs reduced
Weight Reduction = Function(Age, Height,
Weight, Time On Exercise)
=> Weight Reduction = b1*Height + b2*Weight +
b3*Age + b4*Time On Exercise + b0
15. Predicting weight reduction in form of the
number of KGs reduced
As part of training the above model
Goal:
Find the values of b1, b2, b3, b4, and b0 which
minimize the objective function.
The objective function is the mean squared error,
that is, the mean of the squared differences between
the actual value and the predicted value over the
training examples (different values of age, height,
weight, and time on exercise).
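A small sketch of this objective, assuming made-up data: it evaluates the mean squared error for two candidate sets of coefficients (b1..b4, b0) and shows that the better candidate has the lower objective value. The rows, targets, and candidate coefficients are all hypothetical; a real training procedure would search for the coefficients instead of comparing two guesses.

```python
def predict(coeffs, intercept, features):
    """b1*Height + b2*Weight + b3*Age + b4*TimeOnExercise + b0."""
    return intercept + sum(b * x for b, x in zip(coeffs, features))


def mean_squared_error(coeffs, intercept, rows, targets):
    """Mean of squared differences between actual and predicted values."""
    squared = [(y - predict(coeffs, intercept, row)) ** 2
               for row, y in zip(rows, targets)]
    return sum(squared) / len(squared)


# Hypothetical rows: (height cm, weight kg, age years, exercise hours per week)
rows = [
    (170.0, 80.0, 30.0, 3.0),
    (160.0, 70.0, 45.0, 1.0),
    (180.0, 95.0, 25.0, 5.0),
]
targets = [2.0, 0.5, 4.0]  # hypothetical kilograms reduced

# Candidate A uses only exercise time; candidate B always predicts zero
mse_a = mean_squared_error((0.0, 0.0, 0.0, 1.0), -1.0, rows, targets)
mse_b = mean_squared_error((0.0, 0.0, 0.0, 0.0), 0.0, rows, targets)
print(mse_a, mse_b)
```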
16. Forecasting sales
Organizations often use linear regression models to
forecast future sales.
This can be helpful for things like budgeting and
planning.
Algorithms such as Amazon’s item-to-item
collaborative filtering are used to predict what
customers will buy in the future based on their past
purchase history
17. Cash forecasting
Many businesses use linear regression to forecast how
much cash they’ll have on hand in the future.
This is important for things like managing expenses
and ensuring that there is enough cash on hand to
cover unexpected costs.
18. Analyzing survey data
Linear regression can also be used to analyze survey
data.
This can help businesses understand things like
customer satisfaction and product preferences.
For example, a company might use linear regression
to estimate how likely people are to recommend its
product to others.
19. Stock predictions
A lot of businesses use linear regression models to
predict how stocks will perform in the future.
This is done by analyzing past data on stock prices
and trends to identify patterns.
20. Predicting consumer behavior
Businesses can use linear regression to predict things
like how much a customer is likely to spend.
Regression models can also be used to predict
consumer behavior. This can be helpful for things like
targeted marketing and product development.
For example, Walmart uses linear regression to predict
what products will be popular in different regions of
the country.
27. Regression for classification
• In some cases we can use linear regression for determining the
appropriate boundary.
• However, since the output is usually binary or discrete there are
more efficient regression methods
28. Regression for classification
• Assume we would like to use linear regression to learn the
parameters for p(y | X ; θ)
• Problems?
[Figure: optimal regression model separating points labeled 1 and -1]
$w^T X \ge 0$ ⇒ classify as 1
$w^T X < 0$ ⇒ classify as -1
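The decision rule on this slide can be sketched in a few lines. The weight vector below is hypothetical, and the first input component is assumed to be a constant 1 so that the first weight acts as a bias.

```python
def classify(w, x):
    """Return 1 if w^T x >= 0, otherwise -1."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score >= 0 else -1


# Hypothetical weights: bias 1.0 and slope -2.0, so the boundary is at x = 0.5
w = [1.0, -2.0]
labels = [classify(w, [1.0, value]) for value in (0.2, 0.5, 1.0)]
print(labels)
```

Note that the rule only looks at the sign of the score, while least squares fitting also penalizes points that are far on the correct side of the boundary, which is one reason a dedicated classification model is preferred.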
30. Logistic Regression
Logistic regression is a predictive modelling
technique in which the target variable (output)
takes discrete values for a given set of features or
inputs (X).
For example, whether someone is COVID-19 positive
(1) or negative (0).
It is a powerful yet simple classification
algorithm in machine learning, borrowed from
statistics.
Around 60% of the world’s classification problems
can be solved by using the logistic regression
algorithm.
31. Logistic Regression
Logistic regression is one of the most common
machine learning algorithms used for binary
classification.
It predicts the probability of occurrence of a binary
outcome.
Typical applications include fraud detection, spam
detection, and cancer detection.
32. Sigmoid Function
The sigmoid function, also called the logistic function,
takes any real value and maps it to a value between 0
and 1. Its graph is shaped like the letter “S”.
$g(h) = \frac{1}{1 + e^{-h}}$
34. Sigmoid Function
If the input z of the sigmoid goes to positive infinity, the
predicted value of y approaches 1, and if it goes to
negative infinity, the predicted value of y approaches 0.
If the output of the sigmoid function is more
than 0.5, we classify the example as class 1 (the
positive class), and if it is less than 0.5, we
classify it as class 0 (the negative class).
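A minimal sketch of the sigmoid and the 0.5 threshold described above. The slide does not say what happens when the output is exactly 0.5; assigning that case to class 0 is an assumption made here. The function names are hypothetical.

```python
import math


def sigmoid(z):
    """Map any real z to the interval (0, 1)."""
    # Two branches avoid overflow in exp() for large |z|
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_class(z, threshold=0.5):
    """Class 1 if sigmoid(z) is above the threshold, otherwise class 0."""
    # Assumption: an output exactly equal to the threshold goes to class 0
    return 1 if sigmoid(z) > threshold else 0


print(sigmoid(0.0), predict_class(2.0), predict_class(-2.0))
```

Because the sigmoid is 0.5 exactly at z = 0, thresholding the probability at 0.5 is the same as checking the sign of z.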
35. Difference between Linear & Logistic
Regression
Linear regression is used when the dependent variable is
continuous in nature, for example weight, height, numbers, etc.
In contrast, logistic regression is used when the dependent
variable is binary or limited to a few categories, for example
yes and no, true and false, 0 or 1, etc.
In the 19th century, people used linear regression in biology to
predict disease, but this is risky: its output is not restricted
to the range 0 to 1, so it is not a reliable probability. For example,
if a patient has cancer and the predicted probability of malignancy is
0.4, the tumour will be labelled benign (because the probability is
< 0.5). That is where logistic regression comes in: it models the
probability of the class directly, and thresholding that probability
gives a binary result.
36. Logistic regression vs. Linear regression
$p(y = 0 \mid X; \theta) = g(w^T X) = \frac{1}{1 + e^{w^T X}}$
$p(y = 1 \mid X; \theta) = 1 - g(w^T X) = \frac{e^{w^T X}}{1 + e^{w^T X}}$
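The two class probabilities on this slide can be sketched as below. Note that with the slide's convention g is 1/(1 + e^{w^T X}), so p(y = 1 | X) = e^{w^T X}/(1 + e^{w^T X}), which is the standard sigmoid evaluated at w^T X. The function name and the example weights are hypothetical.

```python
import math


def class_probabilities(w, x):
    """Return (p(y=0 | x), p(y=1 | x)) for weight vector w."""
    a = sum(wi * xi for wi, xi in zip(w, x))  # a = w^T x
    # p(y=1) = e^a / (1 + e^a), computed in an overflow-safe way
    if a >= 0:
        p1 = 1.0 / (1.0 + math.exp(-a))
    else:
        ea = math.exp(a)
        p1 = ea / (1.0 + ea)
    p0 = 1.0 - p1  # equals 1 / (1 + e^a)
    return p0, p1


# Hypothetical one-feature example with w^T x = ln 3, so the odds are 3 to 1
p0, p1 = class_probabilities([math.log(3.0)], [1.0])
print(p0, p1)
```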
37. Determining parameters for logistic
regression problems
• So how do we learn the
parameters?
• Similar to other regression problems
we look for the MLE for w
• The likelihood of the data given
the model is:
$L(y \mid X; w) = \prod_i \left(1 - g(X_i; w)\right)^{y_i} \, g(X_i; w)^{(1 - y_i)}$
where
$p(y = 0 \mid X; \theta) = g(X; w) = \frac{1}{1 + e^{w^T X}}$
$p(y = 1 \mid X; \theta) = 1 - g(X; w) = \frac{e^{w^T X}}{1 + e^{w^T X}}$
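In practice the logarithm of this likelihood is usually maximized, since a sum is easier to handle than a product. A minimal sketch, assuming labels y_i in {0, 1} and using log p(y=1) = -log(1 + e^{-a}) and log p(y=0) = -log(1 + e^{a}) with a = w^T X_i; the data and function name are hypothetical.

```python
import math


def log_likelihood(w, xs, ys):
    """Sum over i of y_i * log p(y=1 | x_i) + (1 - y_i) * log p(y=0 | x_i)."""
    total = 0.0
    for x, y in zip(xs, ys):
        a = sum(wi * xi for wi, xi in zip(w, x))  # a = w^T x_i
        log_p1 = -math.log1p(math.exp(-a))
        log_p0 = -math.log1p(math.exp(a))
        total += y * log_p1 + (1 - y) * log_p0
    return total


# Hypothetical data: first component is a constant 1 (bias input)
xs = [(1.0, -2.0), (1.0, -1.0), (1.0, 1.0), (1.0, 2.0)]
ys = [0, 0, 1, 1]

ll_zero = log_likelihood([0.0, 0.0], xs, ys)  # every probability is 0.5
ll_good = log_likelihood([0.0, 1.0], xs, ys)  # weights that fit the labels
print(ll_zero, ll_good)
```

Weights that explain the labels better give a higher (less negative) log-likelihood, which is the quantity the MLE maximizes.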
38. Gradient Descent
Gradient descent is an optimization algorithm
used to minimize some function by iteratively
moving in the direction of steepest descent as
defined by the negative of the gradient.
In machine learning, we use gradient descent to
update the parameters of our model.
39. Gradient Descent
Starting at the top of the mountain, we take our first
step downhill in the direction specified by the
negative gradient.
Next we recalculate the negative gradient (passing in
the coordinates of our new point) and take another
step in the direction it specifies.
We continue this process iteratively until we get to
the bottom of our graph, or to a point where we can
no longer move downhill–a local minimum.
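The downhill walk described above can be sketched on a simple one-dimensional function. The function f(w) = (w - 3)^2, the starting point, and the step settings are all hypothetical choices for illustration.

```python
def gradient_descent(grad, start, learning_rate, steps):
    """Repeatedly step in the direction of the negative gradient."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * grad(w)  # recompute the gradient at the new point
    return w


def grad_f(w):
    """Gradient of f(w) = (w - 3) ** 2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


w_min = gradient_descent(grad_f, start=0.0, learning_rate=0.1, steps=100)
print(w_min)
```

This function has a single valley, so the local minimum that gradient descent reaches is also the global one.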
41. Learning Rate
The size of these steps is called the learning rate.
With a high learning rate we can cover more ground each
step, but we risk overshooting the lowest point since the slope
of the hill is constantly changing.
With a very low learning rate, we can confidently move in
the direction of the negative gradient since we are
recalculating it so frequently.
A low learning rate is more precise, but calculating the
gradient is time-consuming, so it will take us a very long time
to get to the bottom.
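A small experiment illustrating this trade-off, again on the hypothetical function f(w) = (w - 3)^2 with a fixed budget of 50 steps. The three learning rates are arbitrary values picked to show slow progress, good progress, and overshooting.

```python
def distance_after(learning_rate, steps=50, start=0.0, target=3.0):
    """Run gradient descent on (w - target) ** 2 and return |w - target|."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * (w - target)
        w -= learning_rate * gradient
    return abs(w - target)


too_low = distance_after(0.01)   # safe steps, but still far away after 50 steps
moderate = distance_after(0.1)   # reaches the bottom quickly
too_high = distance_after(1.1)   # every step overshoots and the error grows
print(too_low, moderate, too_high)
```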
42. Gradient ascent
[Figure: z plotted against w, with slope ∂z/∂w and step Δw]
• Going in the direction of the slope will lead to a larger z
• But not too much, otherwise we would go beyond the
optimal w
43. Gradient descent
[Figure: z plotted against w, with slope ∂z/∂w and steps Δz and Δw]
• Going in the opposite direction to the slope will lead to
a smaller z
• But not too much, otherwise we would go beyond the
optimal w
44. Finding The Best Weights -
Hill Descent
A ball on complicated hilly terrain rolls down to a local valley;
this is called a local minimum.
Questions:
How to get to the bottom of the deepest valley?
How to do this when we don’t have gravity?
45. Our E_in Has Only One Valley
[Figure: in-sample error E_in plotted against the weights w]
. . . because E_in(w) is a convex function of w.
46. Gradient Descent Method
Batch Gradient Descent: use all examples in each iteration
Mini-batch Gradient Descent: use some examples in each iteration
Stochastic Gradient Descent: use 1 example in each iteration
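The three variants differ only in how many examples feed each update, so a single routine with a batch_size parameter can sketch all of them. This is an illustrative implementation for logistic regression, assuming labels in {0, 1} and the gradient (sigmoid(w^T x) - y) * x of the negative log-likelihood; the data, hyperparameters, and names are hypothetical.

```python
import math
import random


def sigmoid(a):
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    ea = math.exp(a)
    return ea / (1.0 + ea)


def train_logistic(xs, ys, batch_size, learning_rate=0.1, epochs=200, seed=0):
    """Gradient descent on the average negative log-likelihood of each batch."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad = [0.0] * len(w)
            for i in batch:
                a = sum(wj * xj for wj, xj in zip(w, xs[i]))
                error = sigmoid(a) - ys[i]
                for j, xj in enumerate(xs[i]):
                    grad[j] += error * xj / len(batch)
            w = [wj - learning_rate * gj for wj, gj in zip(w, grad)]
    return w


# Hypothetical data: first component is a constant 1 (bias input)
xs = [(1.0, -2.0), (1.0, -1.0), (1.0, 1.0), (1.0, 2.0)]
ys = [0, 0, 1, 1]

weights_by_variant = {
    "batch": train_logistic(xs, ys, batch_size=len(xs)),  # all examples
    "mini-batch": train_logistic(xs, ys, batch_size=2),   # some examples
    "stochastic": train_logistic(xs, ys, batch_size=1),   # one example
}
print(weights_by_variant)
```

Smaller batches give noisier but cheaper updates; on this tiny data set all three variants end up separating the two classes.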
47. Regularization
• Similar to other data estimation problems, we may not have enough
samples to learn good models for logistic regression classification
• One way to overcome this is to ‘regularize’ the model, that is, to impose
additional constraints on the parameters we are fitting.
• For example, let's assume that $w_j$ comes from a Gaussian
distribution with mean 0 and variance $\sigma^2$ (where $\sigma^2$ is a
user-defined parameter): $w_j \sim N(0, \sigma^2)$
• In that case we have a prior on the parameters and so:
$p(y = 1, \theta \mid X) \propto p(y = 1 \mid X; \theta) \, p(\theta)$
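Taking the negative logarithm of likelihood times prior turns the Gaussian prior into a squared penalty: each weight contributes w_j² / (2σ²), plus a constant that does not depend on w. A minimal sketch of the resulting objective, assuming labels in {0, 1} and p(y = 1 | X) = e^{w^T X}/(1 + e^{w^T X}); the data and function name are hypothetical, and the constant term is dropped.

```python
import math


def negative_log_posterior(w, xs, ys, sigma2):
    """Negative log-likelihood plus the Gaussian-prior (L2) penalty."""
    nll = 0.0
    for x, y in zip(xs, ys):
        a = sum(wj * xj for wj, xj in zip(w, x))  # a = w^T x
        # -log p(y=1) = log(1 + e^-a); -log p(y=0) = log(1 + e^a)
        nll += math.log1p(math.exp(-a)) if y == 1 else math.log1p(math.exp(a))
    penalty = sum(wj * wj for wj in w) / (2.0 * sigma2)
    return nll + penalty


# Hypothetical data: first component is a constant 1 (bias input)
xs = [(1.0, -2.0), (1.0, -1.0), (1.0, 1.0), (1.0, 2.0)]
ys = [0, 0, 1, 1]

large_w = [0.0, 5.0]
strong_prior = negative_log_posterior(large_w, xs, ys, sigma2=1.0)
weak_prior = negative_log_posterior(large_w, xs, ys, sigma2=1e9)
print(strong_prior, weak_prior)
```

A smaller σ² penalizes large weights more heavily, which is how the user-defined variance controls the strength of the regularization.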
48. Credits
Yaser Abu-Mostafa, Caltech, California
Barnabas Poczos and Ziv Bar-Joseph, School of Computer Science,
Carnegie Mellon University
Vibhav Gogate, The University of Texas at Dallas