L1 and L2 loss functions are used to minimize error during machine learning model training. The L1 loss function minimizes the sum of the absolute differences between true and predicted values, while the L2 loss function minimizes the sum of squared differences. These loss functions help the model adjust its parameters to reduce error via backpropagation. The L1 loss function is generally better when outliers are present in the data, as it is not as heavily influenced by outliers as the L2 loss function.
TOPICS
• Introduction to loss functions
• Types of loss functions
• Regression loss functions
• L1 and L2 loss function
LOSS FUNCTIONS
Let’s say I am at the top of a mountain and need to climb down. How do I
decide which way to walk?
• Look around to see all the possible paths
• Reject the ones going up, since those paths would cost me more energy and
make my task even more difficult
• Finally, take the path that I think has the steepest downhill slope
The intuition I just used to judge my decisions is exactly what a loss
function provides.
A loss function maps decisions to their associated costs.
Deciding to go up the slope will cost us energy and time. Deciding to go down
benefits us; therefore, it has a negative cost.
LOSS FUNCTIONS
A loss function is a function that compares the target and predicted output values and measures how well the neural
network models the training data. When training, we aim to minimize the loss between the predicted and target outputs.
In short, it is a way to measure how good our decisions are so that we can minimize the expected error.
In supervised learning algorithms, we want to minimize the error for each training example during the learning
process. This is done using some optimization strategies like gradient descent.
In general, each training input is fed through the neural network in a process called forward propagation. Once the
model has produced an output, this predicted output is compared against the given target output, and the resulting
error is propagated back through the network in a process called backpropagation.
The parameters of the model (its weights and biases) are then adjusted so that it outputs a result closer to the target
output. This is where loss functions come in.
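The forward-propagation / backpropagation loop described above can be sketched for a tiny one-feature linear model trained with gradient descent on the L2 loss. This is a minimal illustration, not the slides' own code; the names w, b, and the learning rate lr are chosen for this example.

```python
import numpy as np

def forward(x, w, b):
    """Forward propagation: compute the predicted output y_hat = w*x + b."""
    return w * x + b

def l2_loss(y_true, y_pred):
    """Sum of squared differences between target and prediction."""
    return np.sum((y_true - y_pred) ** 2)

def train_step(x, y_true, w, b, lr=0.05):
    """One forward pass, loss gradient, and parameter update."""
    y_pred = forward(x, w, b)
    # Gradients of the L2 loss with respect to w and b (backpropagation).
    grad_w = np.sum(-2 * x * (y_true - y_pred))
    grad_b = np.sum(-2 * (y_true - y_pred))
    # Adjust parameters in the direction that reduces the loss.
    return w - lr * grad_w, b - lr * grad_b

# Fit y = 2x on a few points: the loss shrinks as training proceeds.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 2.0, 4.0])
w, b = 0.0, 0.0
for _ in range(200):
    w, b = train_step(x, y, w, b)
```

After training, w is close to 2 and b close to 0, since the data lie exactly on the line y = 2x.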
TYPES OF LOSS FUNCTIONS
Classification Loss Functions (for discrete, categorical variables)
1. Hinge Loss
2. Cross-Entropy Loss
Regression Loss Functions (for continuous numeric variables)
1. L1 Loss Function
2. L2 Loss Function
3. Huber Loss Function
REGRESSION LOSS FUNCTIONS
Regression loss functions measure the error of models
that predict a continuous value. The simplest such
model, linear regression, assumes a linear relationship
between a dependent variable (Y) and an independent
variable (X); hence we try to fit the best line through
these variables.
X = Independent variables
Y = Dependent variable
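As a minimal sketch of fitting the best line, assuming synthetic data and using NumPy's np.polyfit for the least-squares fit:

```python
import numpy as np

# Synthetic, exactly linear data (made up for illustration).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # independent variable
Y = 3.0 * X + 1.0                        # dependent variable

# Fit a degree-1 polynomial (a straight line) by least squares.
# np.polyfit returns the coefficients highest degree first: [slope, intercept].
m, c = np.polyfit(X, Y, deg=1)
```

Since the data lie exactly on y = 3x + 1, the fitted slope m is 3 and the intercept c is 1.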
NON LINEAR REGRESSION
In nonlinear regression, the
experimental data are fitted to a
mathematical model in which the
dependent and independent variables
are related by a nonlinear (curvilinear)
function, which is then optimized. It is
accepted as a flexible form of
regression analysis.
L1 LOSS FUNCTION (LEAST ABSOLUTE DEVIATIONS)
The L1 loss function is used to minimize the error, which is the sum of all the absolute differences
between the true values and the predicted values.
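A minimal NumPy sketch of the L1 loss as defined above (the function name l1_loss is my own):

```python
import numpy as np

def l1_loss(y_true, y_pred):
    """Sum of absolute differences between true and predicted values."""
    return np.sum(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

# Example: |3-2| + |5-5| + |7-9| = 1 + 0 + 2 = 3
loss = l1_loss([3, 5, 7], [2, 5, 9])
```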
L2 LOSS FUNCTION (LEAST SQUARE ERRORS)
The L2 loss function is used to minimize the error, which is the sum of all the squared differences
between the true values and the predicted values.
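The L2 loss differs from L1 only in squaring each difference instead of taking its absolute value (again, l2_loss is an illustrative name):

```python
import numpy as np

def l2_loss(y_true, y_pred):
    """Sum of squared differences between true and predicted values."""
    return np.sum((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

# Example: (3-2)**2 + (5-5)**2 + (7-9)**2 = 1 + 0 + 4 = 5
loss = l2_loss([3, 5, 7], [2, 5, 9])
```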
HOW A LOSS FUNCTION WORKS
From the figure, if Y_pred is very far from Y, the loss
value will be very high. However, if the two values are
close, the loss value will be very low. Hence we need a
loss function that can penalize a model effectively while
it is training on a dataset.
If the loss is very high, this large value propagates
through the network during training, and the weights are
changed more than usual. If it is small, the weights do
not change much, since we consider that the network is
already doing a good job.
DECIDING BETWEEN THE L1 AND L2 LOSS FUNCTIONS
When outliers are present in the dataset, the L2 loss function does not perform well. The reason for
this poor performance is that squaring the differences makes the error from an outlier much larger,
so it dominates the total loss. Hence, the L2 loss function is not useful there.
We therefore tend to prefer the L1 loss function, as it is less affected by outliers; alternatively, the
outliers can be removed first, and then the L2 loss function can be used.
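The effect of an outlier can be illustrated with a small, made-up example (the data values are arbitrary):

```python
import numpy as np

def l1_loss(y_true, y_pred):
    return np.sum(np.abs(y_true - y_pred))

def l2_loss(y_true, y_pred):
    return np.sum((y_true - y_pred) ** 2)

y_true = np.array([1.0, 2.0, 3.0, 100.0])  # last point is an outlier
y_pred = np.array([1.1, 2.1, 2.9, 3.0])    # model fits the inliers, misses the outlier

# The outlier contributes |100-3| = 97 to the L1 loss, but 97**2 = 9409 to
# the L2 loss, so the squared loss is dominated by one bad point.
l1 = l1_loss(y_true, y_pred)
l2 = l2_loss(y_true, y_pred)
```

This is why the slide recommends the L1 loss when outliers cannot be removed: a single extreme point inflates the L2 loss quadratically but the L1 loss only linearly.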