This document provides an overview of linear model classification techniques. It begins with an introduction to linear basis function models and their use for regression and classification tasks. Key components of linear models, including basis functions, weights, and linear combinations, are described. Examples of specific linear models are then given, such as linear regression, logistic regression, and polynomial regression. Advantages and limitations of linear basis function models are also summarized.
Project where data sets of different drivers with different driving behavior were classified with linear regression and machine learning to train and test data.
Adapted Branch-and-Bound Algorithm Using SVM With Model SelectionIJECEIAES
Branch-and-Bound algorithm is the basis for the majority of solving methods in mixed integer linear programming. It has been proving its efficiency in different fields. In fact, it creates little by little a tree of nodes by adopting two strategies. These strategies are variable selection strategy and node selection strategy. In our previous work, we experienced a methodology of learning branch-and-bound strategies using regression-based support vector machine twice. That methodology allowed firstly to exploit information from previous executions of Branch-and-Bound algorithm on other instances. Secondly, it created information channel between node selection strategy and variable branching strategy. And thirdly, it gave good results in term of running time comparing to standard Branch-and-Bound algorithm. In this work, we will focus on increasing SVM performance by using cross validation coupled with model selection.
Iterative Determinant Method for Solving Eigenvalue Problemsijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Project where data sets of different drivers with different driving behavior were classified with linear regression and machine learning to train and test data.
Adapted Branch-and-Bound Algorithm Using SVM With Model SelectionIJECEIAES
Branch-and-Bound algorithm is the basis for the majority of solving methods in mixed integer linear programming. It has been proving its efficiency in different fields. In fact, it creates little by little a tree of nodes by adopting two strategies. These strategies are variable selection strategy and node selection strategy. In our previous work, we experienced a methodology of learning branch-and-bound strategies using regression-based support vector machine twice. That methodology allowed firstly to exploit information from previous executions of Branch-and-Bound algorithm on other instances. Secondly, it created information channel between node selection strategy and variable branching strategy. And thirdly, it gave good results in term of running time comparing to standard Branch-and-Bound algorithm. In this work, we will focus on increasing SVM performance by using cross validation coupled with model selection.
Iterative Determinant Method for Solving Eigenvalue Problemsijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Machine learning workshop, session 3.
- Data sets
- Machine Learning Algorithms
- Algorithms by Learning Style
- Algorithms by Similarity
- People to follow
Statistical theory is a branch of mathematics and statistics that provides the foundation for understanding and working with data, making inferences, and drawing conclusions from observed phenomena. It encompasses a wide range of concepts, principles, and techniques for analyzing and interpreting data in a systematic and rigorous manner. Statistical theory is fundamental to various fields, including science, social science, economics, engineering, and more.
Using Monte Carlo Simulation in Project Estimates by Akram Najjar
The PMI Lebanon is glad to announce that Akram Najjar is the speaker for the a lecture titled “Using Monte Carlo Simulation in Project Estimates” delivered on Thursday, 28 July 2016
Lecture Outline
* Why are single point estimates unreliable and what is the alternative?
* What are distributions and how do we extract random samples from them (using Excel)? Two costing examples.
* How to setup a Monte Carlo Simulation model in a spreadsheet?
* Two PM examples (in detail)
* How to statistically analyze the thousands of runs to reach reliable estimates?
Lecture Objectives
* A Project Manager usually knows how certain parameters (such as duration, resource rates or quantities) behave. However, the PM can almost never define reliable single point estimates for these parameters. The result: many projects fail due to unreliable estimates. The alternative? The PM has to use his/her knowledge of how specific parameters behave statistically. For example, the PM knows that a specific task’s duration is distributed according to the bell shaped curve OR that another is uniformly distributed (flat variation), or triangular, or Beta-PERT, etc. The PM can then use Monte Carlo Simulation (MCS) to arrive at statistically significant and robust results. Monte Carlo Simulation (MCS) is a technique that relies on two processes. Process 1 aims at developing a spreadsheet model that calculates the critical path or the total cost, etc. The calculation is setup in a single row (or Run). This row is then duplicated a large number of times (thousands). Process 2 aims at inserting Excel functions in each of the parameters (durations, costs). In each row (or Run), such functions will provide a sample drawn from a statistical distribution that properly describes the behavior of that parameter. For example, a specific duration follows a Normal (Bell) distribution with an Average A and a Standard Deviation S. The model will then generate for each run and for that duration a different value that conforms with the bell shaped curve as defined (A and S). Each of these thousands of runs will provide the PM with a different “simulation” of the duration or the total cost, etc. By statistically analyzing the thousands of results, the PM can arrive at a robust and reliable estimate. Proprietary Add On’s for Monte Carlo Simulation in Microsoft Project are available. However, it is easy, free and more flexible to use native Microsoft functions to carry out the full simulation. The talk covered all the steps needed for such simulations giving several examples
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Machine learning workshop, session 3.
- Data sets
- Machine Learning Algorithms
- Algorithms by Learning Style
- Algorithms by Similarity
- People to follow
Statistical theory is a branch of mathematics and statistics that provides the foundation for understanding and working with data, making inferences, and drawing conclusions from observed phenomena. It encompasses a wide range of concepts, principles, and techniques for analyzing and interpreting data in a systematic and rigorous manner. Statistical theory is fundamental to various fields, including science, social science, economics, engineering, and more.
Using Monte Carlo Simulation in Project Estimates by Akram Najjar
The PMI Lebanon is glad to announce that Akram Najjar is the speaker for the a lecture titled “Using Monte Carlo Simulation in Project Estimates” delivered on Thursday, 28 July 2016
Lecture Outline
* Why are single point estimates unreliable and what is the alternative?
* What are distributions and how do we extract random samples from them (using Excel)? Two costing examples.
* How to setup a Monte Carlo Simulation model in a spreadsheet?
* Two PM examples (in detail)
* How to statistically analyze the thousands of runs to reach reliable estimates?
Lecture Objectives
* A Project Manager usually knows how certain parameters (such as duration, resource rates or quantities) behave. However, the PM can almost never define reliable single point estimates for these parameters. The result: many projects fail due to unreliable estimates. The alternative? The PM has to use his/her knowledge of how specific parameters behave statistically. For example, the PM knows that a specific task’s duration is distributed according to the bell shaped curve OR that another is uniformly distributed (flat variation), or triangular, or Beta-PERT, etc. The PM can then use Monte Carlo Simulation (MCS) to arrive at statistically significant and robust results. Monte Carlo Simulation (MCS) is a technique that relies on two processes. Process 1 aims at developing a spreadsheet model that calculates the critical path or the total cost, etc. The calculation is setup in a single row (or Run). This row is then duplicated a large number of times (thousands). Process 2 aims at inserting Excel functions in each of the parameters (durations, costs). In each row (or Run), such functions will provide a sample drawn from a statistical distribution that properly describes the behavior of that parameter. For example, a specific duration follows a Normal (Bell) distribution with an Average A and a Standard Deviation S. The model will then generate for each run and for that duration a different value that conforms with the bell shaped curve as defined (A and S). Each of these thousands of runs will provide the PM with a different “simulation” of the duration or the total cost, etc. By statistically analyzing the thousands of results, the PM can arrive at a robust and reliable estimate. Proprietary Add On’s for Monte Carlo Simulation in Microsoft Project are available. However, it is easy, free and more flexible to use native Microsoft functions to carry out the full simulation. The talk covered all the steps needed for such simulations giving several examples
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
1. 8/31/2023 1
Module 2: Linear Model for Classification
By
Ms. Sarika Dharangaonkar
Assistant Professor,
KJSCE
2. 8/31/2023 2
Contents: Module2
2.1 Linear Basis Function Models
2.2 Bayesian Linear Regression
2.3 Discriminant Functions
2.4 Probabilistic Generative Models
2.5 Probabilistic Discriminative Models
3. 8/31/2023 3
Linear Basis Function Models
• Linear Basis Function Models, often referred to simply as linear models,
are a class of machine learning algorithms used for various tasks like
regression and classification.
• These models involve transforming the input data using a set of basis
functions to create a linear combination of features, which then forms the
foundation for making predictions or classifications.
4. 8/31/2023 4
Key Components of Linear Basis Function Model
• Basis Functions:
o Basis functions are functions that transform the input features into a new
representation.
o Common examples include polynomial basis functions (e.g., quadratic, cubic),
Gaussian radial basis functions, and Fourier basis functions.
o The choice of basis functions impacts how well the model can capture the
underlying patterns in the data.
• Weights and Parameters:
o Linear models include weights or coefficients associated with each basis
function.
o These weights determine the contribution of each basis function to the final
prediction.
o Adjusting these weights during the learning process is essential for the model to
fit the data effectively.
5. 8/31/2023 5
• Linear Combination:
o Linear models combine the basis functions and their associated weights in a linear
manner to produce the model's output.
o The transformed features are linearly weighted to generate predictions.
Key Components of Linear Basis Function Model
6. 8/31/2023 6
Types of Linear Basis Function Models
1. Linear Regression:
o Linear regression is a classic example of a linear basis function model.
o It uses a linear combination of input features (possibly transformed by basis
functions) to predict a continuous target variable.
2. Logistic Regression:
o Despite the name, logistic regression is a classification algorithm that uses linear
combinations of features (often transformed by sigmoid functions) to estimate the
probability of a binary outcome.
3. Ridge Regression (L2 Regularization):
o Ridge regression extends linear regression by adding a regularization term based
on the L2 norm of the weights.
o This helps prevent overfitting and can handle multicollinearity between features.
7. 8/31/2023 7
Types of Linear Basis Function Models…
4. Lasso Regression (L1 Regularization):
o Lasso regression is similar to ridge regression, but it adds an L1 regularization
term to the cost function.
o Lasso can also perform feature selection by driving some weights to exactly zero.
5. Polynomial Regression:
o Polynomial regression is a form of linear regression where the input features are
transformed using polynomial basis functions, allowing the model to capture non-
linear relationships.
8. 8/31/2023 8
Advantages of linear basis function models:
• Linear models are computationally efficient and can be trained on large
datasets.
• They provide interpretable results, as the impact of each feature can be directly
observed through the weights.
• Linear models can generalize well when the data exhibits linear or near-linear
relationships.
Limitations of linear basis function models:
• Linear models struggle with capturing complex non-linear relationships in the
data.
• The effectiveness of linear models highly depends on the choice of appropriate
basis functions.
• For high-dimensional datasets, linear models might not perform well due to
their simplicity.
9. 8/31/2023 9
Linear Regression
• The process of finding a straight line that best approximates a set of
points on graph is called Linear Regression
• The word Linear signifies that the type of relationship that you try to
establish between the variables tends to be a straight line
• It is simple and popular technique of regression analysis
10. 8/31/2023 10
Example:
• When you visit the a petrol pump, you have good estimate of how much
you would drive in the next few days and how much petrol you will need
correspondingly?
• Here how you are predicting your usage? It is based on your past data
points. That is nothing but the regression analysis
11. 8/31/2023 11
Example…
• When we try to establish a correlation between one variable (independent
variable) with another variable (dependent variable) and use the
relationship to predict the values.
• So, we can write:
o Petrol required (X)= 5litres
o Kilometres driven (Y)= 50km
o Then we can write the relationship as: Y= 10X
• Prediction: For 10litres of Petrol you can go up to 10x10=100kms
12. 8/31/2023 12
Types of Linear Regression
• Simple Linear Regression (SLR):
o It has only one Independent Variable.
o Example: No. of litres of petrol required and Kilometres driven.
• Multiple Linear Regression (MLR):
o It has more than one Independent(or Input) Variables.
o Example: Number of litres of petrol, age of vehicle, speed and kilometres
driven.
13. 8/31/2023 13
• General Formula for Linear Regression is:
Y= Bo + B1X1 + B2X2 + B3X3 + B4X4…….+ BiXi + E
Where,
o Y is the outcome(dependent variable)
o Xi are the values of independent variables
o Bo is the value of Y when each Xi=0. It is also called as Y intercept.
o Bi is the change in Y based on the unit change in Xi. It is called as regression
coefficient or slope of the regression line.
o E is the random error or noise that represents the difference between the
predicted value and actual values.
14. 8/31/2023 14
Linear Regression Algorithm Steps:
• Data Collection:
o Collect a dataset that includes the pairs of independent variable(s) (x) and
dependent variable (y) you want to model.
• Data Preprocessing:
o Clean the data, handle missing values, and remove outliers if necessary.
Preprocess the data to ensure it's ready for analysis.
• Data Splitting:
o Divide the dataset into training and testing sets. A common split is around 70-80%
of the data for training and the rest for testing.
• Model Initialization:
o Initialize the model by assuming an initial value for the slope (m) and the
intercept (c) of the linear equation: y = mx + c.
15. 8/31/2023 15
• Cost Function:
o Define a cost function that quantifies the difference between the predicted
values and the actual values in the training set.
o A commonly used cost function for linear regression is Mean Squared Error
(MSE).
• Gradient Descent:
o Use an optimization algorithm like gradient descent to minimize the cost
function.
o This involves iteratively updating the values of m and c to find the values that
minimize the cost function and best fit the data.
• Model Training:
o Iterate through the training dataset, calculate the predicted values using the
current values of m and c, compute the gradients of the cost function with
respect to m and c, and update the values of m and c using the gradients and a
learning rate.
16. 8/31/2023 16
• Making Predictions:
o After training, use the learned values of m and c to make predictions on new or
unseen data.
o Plug the independent variable(s) into the linear equation to get predicted values
of the dependent variable.
• Model Evaluation:
o Evaluate the performance of the trained model using the testing dataset.
o Common evaluation metrics include Mean Squared Error (MSE), Root Mean
Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared.
• Visualization:
o Visualize the regression line along with the data points to understand how well
the model fits the data.
17. 8/31/2023 17
Numerical on Linear Regression
• Suppose you are given a dataset that represents the relationship between
the number of hours studied and the corresponding exam scores. You
want to build a linear regression model to predict exam scores based on
the number of hours studied.
• Here's a small portion of the dataset:
18. 8/31/2023 18
Solution:
• To find the linear relationship between hours studied and exam scores
the formula can be given as: y = ax + b
Where:
y is the predicted exam score
x is the number of hours studied
m is the slope (coefficient) of the line
b is the y-intercept
Step 1: Calculate the means of x and y:
mean(x) = (2 + 3 + 4 + 5 + 6) / 5 = 4
mean(y) = (50 + 65 + 75 + 80 + 90) / 5 = 72
19. 8/31/2023 19
Step 2: Calculate the slope (a):
a = Σ((x - mean(x)) * (y - mean(y))) / Σ((x - mean(x))^2)
= ((2-4)*(50-72) + (3-4)*(65-72) + (4-4)*(75-72) + (5-4)*(80-72)
+ (6-4)*(90-72)) / ((2-4)^2 + (3-4)^2 + (4-4)^2 + (5-4)^2 + (6-
4)^2)
= 485 / 10
= 48.5
Step 3: Calculate the y-intercept (b):
b = mean(y) - m * mean(x) = 72 - 48.5 * 4 = -18
So, the equation for the linear regression line is:
y = 48.5x - 18
20. 8/31/2023 20
• Now you can use this equation to predict exam scores for different
values of hours studied.
• For example, if someone studies for 7 hours:
• y = 48.5 * 7 - 18 = 337.5
23. 8/31/2023 23
Logistic Regression
• Logistic Regression is classification algorithm
• It's used to predict the probability that an input belongs to a particular
class.
• Sometime, the output(the dependent variable) of the regression is
binary event having yes/no, pass/fail, 0/1 kind of results.
• In this case you need a regression model that find the probability of the
event occurring or not.
• Example:
o Given a particular humidity of the day, what are the chances of raining?
o Given certain credit score, what are the possibilities of getting your loan
application approved?
Logistic Regression provides a probability prediction model based on the
values of independent variables.
24. 8/31/2023 24
Logistic Regression
• Logistic regression is a regression technique used to model and estimate
the probability of an event to occur based on the values of the
independent variables.
• Unlike Linear regression line, the logistic regression line is a curve.
• This curve is called as Sigmoid curve or S-curve
• All the outcome data points are either concentrated on 0 or 1.
25. 8/31/2023 25
Sigmoid Curve of Logistic Regression
• The curve depicts various possible
probabilities of an event, occurring
between 0(totally uncertain) and
1(totally certain), on the y-axis.
26. 8/31/2023 26
Logistic Regression..
• The Probability prediction Formula for logistic regression is as
following:
• Where, p is predicted probability
• e is the exponential function
• Bo and B1 are regression coefficients
• X1 is the value of the independent variable
27. 8/31/2023 27
Numerical on Logistic Regression…
• A team scored 285 runs in a cricket match. Assuming regression
coefficients to be 0.3548 and 0.00089 respectively, calculate its
probability of winning the match.
28. 8/31/2023 28
Bayesian Linear Regression
• Bayesian Linear Regression is an extension of the traditional linear
regression that incorporates Bayesian principles to estimate the
parameters of the linear regression model.
• It combines prior knowledge (prior distribution) about the parameters
with observed data to obtain a posterior distribution, allowing for
uncertainty quantification and better inference about the model's
parameters.
• Bayesian Linear Regression provides a probabilistic framework for
modeling uncertainty in parameter estimates, making it particularly
useful when dealing with limited data or when you want to quantify
uncertainty in your predictions.
• It requires setting up appropriate prior distributions and often involves
more computation compared to traditional linear regression.
29. 8/31/2023 29
Bayesian Linear Regression…
Model Assumption:
• Similar to standard linear regression, the basic assumption is that the
relationship between the input features (predictors) and the target variable
(response) is linear.
Parameter Estimation in Standard Linear Regression:
• In traditional linear regression, parameter estimation involves finding the
values of the coefficients that minimize the sum of squared differences
between predicted and actual values (ordinary least squares).
• This estimation does not provide information about the uncertainty
associated with the parameter estimates.
30. 8/31/2023 30
Bayesian Linear Regression…
Bayesian Approach:
• Bayesian Linear Regression treats the coefficients as random variables
with probability distributions (prior distributions).
• Prior distribution reflects the initial belief or knowledge about the
parameter values before observing any data.
Bayes' Theorem:
• Bayesian inference involves using Bayes' theorem to update our beliefs
about the parameters after observing the data.
• Posterior distribution = (Likelihood * Prior) / Evidence, where:
o Likelihood: Measures how well the observed data matches the predictions made
by the model.
o Prior: Reflects prior knowledge or beliefs about the parameter values.
o Evidence: Acts as a normalization factor, ensuring that the posterior distribution is
a proper probability distribution.
31. 8/31/2023 31
Bayesian Linear Regression…
Parameter Estimation with Bayesian Linear Regression:
• The posterior distribution gives us a range of possible parameter values
along with their probabilities.
• The mean of the posterior distribution represents the point estimate of the
parameters, and the spread of the distribution represents the uncertainty
associated with the estimates.
Prediction and Uncertainty:
• Bayesian Linear Regression provides not only point predictions but also
predictive distributions for new data points.
• Prediction intervals or credible intervals from the posterior distribution
can be used to quantify prediction uncertainty.
32. 8/31/2023 32
Bayesian Linear Regression…
• Bayesian Linear Regression provides a probabilistic framework for
modeling uncertainty in parameter estimates, making it particularly
useful when dealing with limited data or when you want to quantify
uncertainty in your predictions.
• It requires setting up appropriate prior distributions and often involves
more computation compared to traditional linear regression.
33. 8/31/2023 33
Bayesian Linear Regression step by step
Step 1: Model Assumption
• Bayesian Linear Regression assumes that the relationship between the input
features ‘X’ and the target variable ‘y’ is linear. The standard linear regression
equation is given by:
y = β₀ + β₁*x₁ + β₂*x₂ + ... + βₙ*xₙ + ε
• Where:
• y is the target variable (response).
• x₁, x₂, ..., xₙ are the input features.
• β₀, β₁, ..., βₙ are the coefficients (parameters) we want to estimate.
• ε represents the error term.
34. 8/31/2023 34
Step 2: Introduction of Bayesian Principles
• In Bayesian Linear Regression, we treat the coefficients as random
variables with prior distributions that reflect our initial beliefs about
their values.
Step 3: Prior Distribution
• We introduce prior distributions for the coefficients to represent our
prior beliefs about their values. For simplicity, let's assume Gaussian
priors for each coefficient:
βᵢ ~ N(μ₀, σ₀²)
• Where:
• βᵢ is the i-th coefficient.
• N(μ₀, σ₀²) represents a Gaussian (normal) distribution with mean μ₀ and
variance σ₀².
35. 8/31/2023 35
Step 4: Likelihood Function
• The likelihood function quantifies how well the observed data fits the
model's predictions. Assuming the errors (ε) are normally distributed
with mean 0 and variance σ², the likelihood can be written as:
p(y | X, β) = N(Xβ, σ²I)
Where:
• y is the observed target variable.
• X is the matrix of input features.
• β is the vector of coefficients.
• σ² is the error variance.
• I is the identity matrix.
36. 8/31/2023 36
Step 5: Posterior Distribution
• Using Bayes' theorem, the posterior distribution is proportional to the
product of the prior and likelihood:
p(β | X, y) ∝ p(y | X, β) * p(β)
Step 6: Compute the Posterior
• For simplicity, let's assume a conjugate prior (a prior that leads to a
posterior in the same distribution family) and choose a Gaussian prior
for β. The posterior distribution for each coefficient is also Gaussian:
βᵢ | X, y ~ N(μᵢ, σᵢ²)
Where:
• μᵢ = (σ²₀ * XᵀX + σ² * I)⁻¹ * (σ²₀ * μ₀ + Xᵀy)
• σᵢ² = (σ²₀ * XᵀX + σ² * I)⁻¹
37. 8/31/2023 37
Step 7: Make Predictions
• To make predictions for new data, use the estimated posterior
distributions of the coefficients. The predictive distribution for a new
observation x_new is:
p(y_new | x_new, X, y) = N(μ_new, σ²_new + σ²)
Where:
• μ_new = x_newᵀμ
• σ²_new = x_newᵀσ²X(x_newᵀX + σ²I)⁻¹x_new
Step 8: Hyperparameters and Inference
Hyperparameters like μ₀, σ₀², and σ² control the shape of the prior
distributions and the amount of regularization. These can be chosen based
on domain knowledge or cross-validation.
38. 8/31/2023 38
Discriminant Function
• A discriminant is a function that takes an input vector x and assigns it to one
of K classes, denoted Ck.
• The simplest representation of a linear discriminant function is obtained by
taking a linear function of the input vector so that
• where w is called a weight vector, and w0 is a bias
• The negative of the bias is sometimes called a threshold.
• An input vector x is assigned to class C1 if y(x) 0 and to class C2 otherwise.
39. 8/31/2023 39
Discriminant Function…
feature 2
feature 1
Illustration of the geometry of a
linear discriminant function in two dimensions.
The decision surface, shown in red, is perpendicular
to w, and its displacement from the
origin is controlled by the bias parameter w0.
Also, the signed orthogonal distance of a general
point x from the decision surface is given
by y(x)/w.
The normal distance from the origin to the decision surface is
given by
• w determines orientation of
decision boundary
• ω0 determines location of
decision boundary
40. 8/31/2023 40
Discriminant functions - How to determine
parameters?
1. Least Squares for Classification
• General Principle: Minimize the squared distance (residual) between the observed
data point and its prediction by a model function
41. 8/31/2023 41
Discriminant functions - How to determine
parameters?
• In the context of classification: find the parameters which minimize the squared
distance (residual) between the data points and the decision boundary
42. 8/31/2023 42
Discriminant functions - How to determine
parameters?
• Problem: sensitive to outliers; also distance between the outliers and the
discriminant function is minimized --> can shift function in a way that leads to
misclassifications
least squares
logistic
regression
43. 8/31/2023 43
Discriminant functions - How to determine
parameters?
2. Fisher’s Linear Discriminant
• General Principle: Maximize distance between means of different classes while
minimizing the variance within each class
maximizing
between-class
variance
maximizing between-class
variance & minimizing within-
class variance
44. 8/31/2023 44
Discriminant Function
• A discriminant function focuses on
finding a decision boundary that
separates different classes in the feature
space.
• In a two-dimensional feature space, you
can visualize this as a line or curve that
separates the data points belonging to
different classes.
• For example, in a binary classification
scenario, you might have two classes,
and the discriminant function helps
determine whether a new data point
should be classified as one class or the
other based on which side of the
decision boundary it falls.
45. 8/31/2023 45
Probabilistic Generative Models
• Model class-conditional densities (p(x⎮Ck)) and class priors (p(Ck))
• Use them to compute posterior class probabilities (p(Ck⎮x)) according to Bayes
theorem
• Posterior probabilities can be described as logistic sigmoid function
Inverse of sigmoid function is the logit
function which represents the
ratio of the posterior probabilities for the
two classes
ln[p(C1⎮x)/p(C2⎮x)] --> log odds
46. 8/31/2023 46
Probabilistic Generative Model
Gaussian Mixture Models (GMMs):
• Gaussian Mixture Models assume that the data comes from a mixture of
several Gaussian distributions. The model involves estimating the
parameters of these Gaussians.
• Equation for Gaussian distribution:
o p(x|μ, σ²) = (1 / √(2πσ²)) * exp(-(x - μ)² / (2σ²))
o In the case of GMM, the data point x is generated from one of the K Gaussian
components with probability π_k:
o p(x) = Σ[π_k * p(x|μ_k, σ²_k)] for k = 1 to K
• Parameters to be learned:
• μ_k: Mean of the k-th Gaussian component.
• σ²_k: Variance of the k-th Gaussian component.
• π_k: Mixing coefficient representing the probability of choosing the k-
th component.
47. 8/31/2023 47
Probabilistic Generative Model
Naive Bayes Classifier:
• Naive Bayes Classifier is a generative model that assumes
independence among features given the label. The model calculates the
posterior probability of the label given the features using Bayes'
theorem:
• P(y|x₁, x₂, ..., x_n) ∝ P(y) * P(x₁|y) * P(x₂|y) * ... * P(x_n|y)
• Parameters to be learned:
• P(y): Prior probability of each label.
• P(xᵢ|y): Probability of observing feature xᵢ given label y.
48. 8/31/2023 48
Probabilistic Generative Model:
• A probabilistic generative model aims to
model the distribution of data for each
class separately and then uses these
distributions to generate new data points.
• In a two-class scenario, you could
visualize this as two overlapping
distributions (e.g., Gaussian distributions)
for each class, indicating the probability
density for different feature values.
• The overlap of these distributions
determines the region where the model
might be uncertain about the class.
49. 8/31/2023 49
Probabilistic Discriminative Models
• Probabilistic Discriminative Models are a class of machine learning
models that focus on directly modeling the conditional probability of
the target variable (output) given the input features.
• These models are designed to classify data points into different classes
based on the learned relationship between the features and the target
variable.
• Unlike generative models, which aim to learn the full data distribution,
discriminative models are primarily concerned with making accurate
predictions and decisions.
50. 8/31/2023 50
Probabilistic Discriminative Models
Types of Probabilistic Discriminative Models:
Several types of probabilistic discriminative models exist:
• Logistic Regression:
o Models the probability of a binary outcome using the logistic function. It's a
fundamental discriminative model for binary classification.
• Linear Discriminant Analysis (LDA):
o Assumes that the data follows a Gaussian distribution and models the decision boundary
using a linear combination of input features.
• Support Vector Machines (SVMs):
o Maximizes the margin between different classes by finding the hyperplane that best
separates the data.
• Neural Networks:
o Can be designed as discriminative models by using appropriate activation functions in
the output layer and optimizing them using techniques like backpropagation.
51. 8/31/2023 51
Probabilistic Discriminative Model:
• A probabilistic discriminative model
focuses on modeling the probability of each
class given the input features.
• In a two-dimensional feature space, this
could be represented as contours that
indicate the probability of a data point
belonging to a certain class.
• The decision boundary is where these
contours intersect, indicating a threshold at
which the model makes a decision about the
class.
52. 8/31/2023 52
Aspect Probabilistic Generative Models Probabilistic Discriminative Models
Main Focus Model joint distribution of features and classes Model conditional distribution of classes given features
Goal Provide a complete probabilistic description of data Learn decision boundaries for classification tasks
Usage
Generate new samples, impute missing data, anomaly
detection
Classification tasks with strong performance
Performance vs. Classification
May not achieve the same classification performance as
discriminative models
Often provide better classification performance
Examples
Gaussian Mixture Models (GMMs), Naive Bayes, Hidden
Markov Models (HMMs)
Logistic Regression, Support Vector Machines (SVMs),
Conditional Random Fields (CRFs), Neural Networks
Data Requirement Require a relatively larger amount of data Can perform well even with smaller datasets
Computational Complexity
Generally more complex due to modeling joint
distribution
Often computationally more efficient
Data Generation vs. Classification Can generate new samples similar to training data Focused on classifying new data points
Task Emphasis Emphasis on understanding data generation process Emphasis on accurate classification
53. 8/31/2023 53
Aspect Discriminant Function Probabilistic Generative Models Probabilistic Discriminative Models
Main Purpose Compute decision boundary directly Model joint distribution of data
Model conditional distribution of
classes
Output Scalar value indicating the class Probabilities for classes and data Probabilities for classes given data
Example
Linear discriminant function (LDA),
Support Vector Machines (SVMs)
Gaussian Mixture Models (GMMs),
Naive Bayes
Logistic Regression, Neural Networks
Training
Learns decision boundary directly
from data
Models data generation process
Models decision boundary for
classification
Data Generation Does not generate new data Can generate new samples Does not generate new data
Probability Estimation
Does not provide explicit probability
estimates
Provides explicit class and data
probabilities
Provides class probabilities given data
Use Case
Binary classification, multi-class
classification
Image generation, data augmentation
Image classification, sentiment
analysis
Performance
Good for simple data distributions,
can work well with limited data
May not be optimal for complex
distributions, but provides insights
into data generation process
Often achieves strong classification
performance
Example
Given a new data point, computes a
score that determines the class
Given data and classes, models how
the data is generated
Given data, models how classes are
distributed
Examples
Linear Discriminant Analysis (LDA),
Support Vector Machines (SVMs)
Gaussian Mixture Models (GMMs),
Naive Bayes
Logistic Regression, Neural Networks
54. 8/31/2023 54
Sr.No Linear Regression Logistic Regression
1
Linear regression is used to predict the continuous dependent
variable using a given set of independent variables.
Logistic regression is used to predict the categorical
dependent variable using a given set of independent variables.
2 Linear regression is used for solving Regression problem. It is used for solving classification problems.
3 In this we predict the value of continuous variables In this we predict values of categorical varibles
4 In this we find best fit line.(Linear) In this we find Sigmoid-Curve .
5
Least square estimation method is used for estimation of
accuracy.
Maximum likelihood estimation method is used for
Estimation of accuracy.
6 The output must be continuous value,such as price,age,etc.
Output is must be categorical value such as 0 or 1, Yes or no,
etc.
7
It required linear relationship between dependent and
independent variables.
It not required linear relationship.
8 There may be collinearity between the independent variables.
There should not be collinearity between independent
variable.