This course is prepared under the Erasmus+ KA-210-YOU Project titled
«Skilling Youth for the Next Generation Air Transport Management»
Machine Learning
Applications in Aviation
Regression Methods
Asst. Prof. Dr. Emircan Özdemir
Eskişehir Technical University
• Regression methods play a crucial role in predicting numerical outcomes
in machine learning.
• In the aviation industry, regression methods offer a versatile toolkit for
modeling relationships between variables, whether the task is forecasting
fuel consumption, optimizing flight paths, predicting maintenance needs,
anticipating passenger demand for a particular route, optimizing marketing
strategies to increase ticket sales, or predicting customer satisfaction levels.
Introduction
• Simple Linear Regression
Simple Linear Regression is a foundational regression method used in machine learning
and statistics to model the relationship between two variables: one independent variable
(predictor) and one dependent variable (outcome). The goal is to establish a linear
equation that best fits the observed data points, allowing for predictions or estimations of
the dependent variable based on changes in the independent variable.
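The equation of a simple linear regression line is typically expressed as:
$y = b_0 + b_1 x$
where $y$ is the dependent variable, $x$ is the independent variable, $b_0$ is the intercept,
and $b_1$ is the slope.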
Types of Regression Problems
The regression analysis aims to find the values of b0 and b1 that minimize the difference
between the observed and predicted values, most commonly with the least squares
approach, which minimizes the sum of the squared differences.
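The least squares fit for a single predictor has a closed-form solution, which can be sketched as follows. This is a minimal illustration, not part of the original course material: the function name `fit_simple_linear` and the passenger and fuel figures are hypothetical, and the data are noise-free by construction.

```python
def fit_simple_linear(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: passengers on board vs. fuel burned (tonnes).
passengers = [100, 120, 140, 160, 180]
fuel_tonnes = [5.0, 5.4, 5.8, 6.2, 6.6]

b0, b1 = fit_simple_linear(passengers, fuel_tonnes)
predicted_150 = b0 + b1 * 150  # predicted fuel for 150 passengers
print(f"b0={b0:.3f}, b1={b1:.3f}, prediction={predicted_150:.3f}")
```

With real, noisy data the fitted line would not pass through every point; the same formulas still give the line with the smallest sum of squared residuals.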
Source: https://medium.com/@sachin.hs20/simple-linear-regression-using-example-e4e2a89df54c
In aviation, this model could help predict fuel consumption based on the number of
passengers, offering insights into how changes in passenger load may impact fuel
efficiency. This, in turn, aids in optimizing flight operations, resource planning, and cost
management.
• Multiple Linear Regression
Multiple Linear Regression is an extension of Simple Linear Regression, allowing for the
modeling of the relationship between a dependent variable and multiple independent
variables. In this method, the goal is to create a linear equation that best fits the observed
data points by considering the impact of multiple predictors on the outcome variable.
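The general form of the Multiple Linear Regression equation is:
$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$
where $x_1, \dots, x_k$ are the independent variables, $b_1, \dots, b_k$ are their coefficients,
and $b_0$ is the intercept.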
In the context of aviation, consider a scenario where the fuel consumption of an aircraft
(dependent variable) is influenced not only by the number of passengers (as in Simple
Linear Regression) but also by additional factors such as flight distance, weather conditions,
and aircraft type.
Multiple Linear Regression enables the creation of a model that incorporates all these
variables to provide a more comprehensive understanding of how they collectively influence
fuel consumption.
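One way to fit such a model is to solve the normal equations $(X^T X) b = X^T y$. The sketch below does this in plain Python under stated assumptions: the helper names, the two predictors (passengers and distance in thousands of km), and the flight data are all hypothetical, and the fuel values are generated without noise so the fit can be checked by eye.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple_linear(rows, y):
    """Return [b0, b1, ..., bk] from the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical flights: (passengers, distance in thousands of km).
flights = [(100, 0.5), (150, 0.8), (120, 1.2), (180, 0.6), (160, 1.5)]
# Fuel in tonnes, generated from known coefficients (no noise).
fuel = [1.0 + 0.01 * p + 4.0 * d for p, d in flights]

coefficients = fit_multiple_linear(flights, fuel)
print([round(c, 4) for c in coefficients])
```

Categorical factors such as aircraft type or weather condition would first need to be encoded numerically (for example as indicator columns) before they can enter the design matrix.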
Source: https://www.mathworks.com/help/stats/regress.html
• Polynomial Regression
Polynomial Regression is a type of regression analysis that extends the concept of linear
regression by introducing polynomial terms, allowing for the modeling of non-linear
relationships between the dependent and independent variables. In other words, it
accommodates situations where the relationship between the variables cannot be
adequately represented by a straight line.
The Polynomial Regression equation takes the form:
$y = b_0 + b_1 x + b_2 x^2 + \dots + b_n x^n$
where $n$ is the degree of the polynomial.
The choice of the degree of the polynomial (n) depends on the complexity of the
relationship in the data. A higher-degree polynomial can fit the data more closely but may
risk overfitting, capturing noise rather than true patterns.
In the context of aviation, Polynomial Regression can be applied when predicting variables
with non-linear trends.
For example, forecasting the relationship between aircraft altitude and fuel efficiency might
involve a polynomial model, as the impact of altitude on fuel efficiency may not follow a
straight line. Polynomial Regression allows for a more flexible representation of such
relationships, enhancing the accuracy of predictions in scenarios where non-linear patterns
are evident.
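The effect of the polynomial degree can be illustrated with a small sketch. It assumes a hypothetical, noise-free relationship in which fuel efficiency peaks at a certain altitude; the helper names, units (altitude in tens of thousands of feet), and values are illustrative only. A polynomial model is still linear in its coefficients, so it can be fitted with the same normal equations as linear regression.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(x, y, degree):
    """Return [b0, b1, ..., bn] for a least squares polynomial of given degree."""
    design = [[xi ** p for p in range(degree + 1)] for xi in x]
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def mean_squared_error(coeffs, x, y):
    predictions = [sum(c * xi ** p for p, c in enumerate(coeffs)) for xi in x]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / len(y)


# Hypothetical data: efficiency peaks near 34,000 ft (altitude in 10,000 ft).
altitude = [2.0, 2.4, 2.8, 3.2, 3.6, 4.0]
efficiency = [5.0 - 0.5 * (a - 3.4) ** 2 for a in altitude]

linear_coeffs = fit_polynomial(altitude, efficiency, degree=1)
quadratic_coeffs = fit_polynomial(altitude, efficiency, degree=2)
mse_linear = mean_squared_error(linear_coeffs, altitude, efficiency)
mse_quadratic = mean_squared_error(quadratic_coeffs, altitude, efficiency)
print(f"linear MSE={mse_linear:.4f}, quadratic MSE={mse_quadratic:.2e}")
```

Here the straight line cannot follow the curve, while the degree-2 model recovers it. With noisy data, raising the degree further would keep lowering the training error while predictions on new data could get worse, which is the overfitting risk described above.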
• Linear Regression
It serves as the foundational algorithm for modeling linear relationships
between variables.
This method is particularly useful in scenarios where a clear, straight-line
relationship can be established, such as predicting fuel consumption based
on flight time.
Popular Regression Algorithms
• Ridge Regression
It introduces regularization techniques to improve model performance.
In aviation, where datasets often exhibit multicollinearity among variables,
Ridge Regression's regularization helps mitigate overfitting and enhances
the model's generalizability, contributing to more accurate predictions.
The aim of ridge regression is to minimize the sum of squared residuals (like
ordinary linear regression), but with an additional penalty for the sum of
squared coefficients (L2 regularization term).
It helps prevent overfitting, particularly when there is multicollinearity among
predictor variables.
• Lasso Regression
Lasso Regression, on the other hand, brings in L1 regularization, offering a
unique advantage in feature selection.
This is crucial in aviation scenarios where selecting the most relevant
features can significantly impact the accuracy and efficiency of regression
models. Lasso Regression aids in identifying and prioritizing the key
variables that influence outcomes, such as predicting aircraft maintenance
needs.
Its aim is similar to Ridge Regression, but with an L1 regularization term,
which penalizes the absolute values of the coefficients.
It promotes sparsity in the coefficient values, effectively performing
automatic feature selection by pushing certain coefficients to exactly zero.
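The difference between the two penalties can be seen in the simplest possible case: one centered predictor, where both problems have closed-form solutions. The sketch below is an illustration under that assumption; the function names, the penalty strength `lam`, and the component-age data are hypothetical. Ridge divides by a larger denominator, so the slope shrinks but stays non-zero, while Lasso subtracts a threshold and returns exactly zero once the penalty is large enough.

```python
def centered_sums(x, y):
    """Return (sxy, sxx) computed on mean-centered data."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy, sxx


def ridge_slope(x, y, lam):
    """Slope minimizing sum((y - b1*x)^2) + lam * b1^2 on centered data."""
    sxy, sxx = centered_sums(x, y)
    return sxy / (sxx + lam)


def lasso_slope(x, y, lam):
    """Slope minimizing sum((y - b1*x)^2) + lam * |b1| on centered data."""
    sxy, sxx = centered_sums(x, y)
    threshold = lam / 2.0
    if sxy > threshold:
        return (sxy - threshold) / sxx
    if sxy < -threshold:
        return (sxy + threshold) / sxx
    return 0.0  # the penalty outweighs the signal: coefficient is exactly zero


# Hypothetical data: component age (years) vs. maintenance hours.
component_age_years = [1, 2, 3, 4, 5]
maintenance_hours = [2, 4, 6, 8, 10]

for lam in (0, 10, 50):
    r = ridge_slope(component_age_years, maintenance_hours, lam)
    l = lasso_slope(component_age_years, maintenance_hours, lam)
    print(f"lam={lam}: ridge slope={r:.3f}, lasso slope={l:.3f}")
```

With several predictors there is no such one-line formula for Lasso; iterative methods such as coordinate descent apply the same thresholding step to one coefficient at a time.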
• Elastic Net Regression
Elastic Net Regression is a hybrid regularization technique that combines
aspects of both Ridge Regression and Lasso Regression. It incorporates
both L1 (Lasso) and L2 (Ridge) regularization terms into the linear
regression objective function.
It aims to minimize the sum of squared residuals, like ordinary linear
regression, including both L1 and L2 regularization terms to penalize the
sum of absolute values (L1) and the sum of squared values (L2) of the
coefficients.
It’s suitable for situations where there are many correlated features, and
automatic feature selection is desired.
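The three regularized objectives described above can be written side by side. The notation is an assumption made for this summary: $m$ observations, $k$ predictors, predicted value $\hat{y}_i = b_0 + \sum_{j=1}^{k} b_j x_{ij}$, and non-negative penalty weights $\lambda$, $\lambda_1$, $\lambda_2$ chosen by the analyst. By common convention the intercept $b_0$ is not penalized.

```latex
\begin{aligned}
\text{Ridge:} \quad & \min_{b_0,\dots,b_k} \; \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{k} b_j^2 \\
\text{Lasso:} \quad & \min_{b_0,\dots,b_k} \; \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{k} |b_j| \\
\text{Elastic Net:} \quad & \min_{b_0,\dots,b_k} \; \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 + \lambda_1 \sum_{j=1}^{k} |b_j| + \lambda_2 \sum_{j=1}^{k} b_j^2
\end{aligned}
```

Setting $\lambda_1 = 0$ in the Elastic Net objective gives Ridge, setting $\lambda_2 = 0$ gives Lasso, and setting both to zero gives ordinary least squares.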
• Building the right model is important to avoid overfitting and underfitting.
In some cases, the Ridge, Lasso, and Elastic Net regression algorithms help
us make better long-term predictions by balancing overfitting against underfitting.
Overfitting and Underfitting
Source: https://www.geeksforgeeks.org/lasso-vs-ridge-vs-elastic-net-ml/
• Mean Squared Error (MSE)
MSE is a regression evaluation metric that quantifies the average squared difference
between predicted values and actual values in the dataset.
A lower MSE indicates better model performance, with smaller errors between predicted
and actual values. It provides a measure of the accuracy of the model's predictions.
Regression Evaluation Metrics
• Root Mean Squared Error (RMSE)
The RMSE is a commonly used metric for evaluating the accuracy of a regression model. It
represents the square root of the average squared differences between the actual and
predicted values.
The lower the RMSE, the better the model's performance, as it indicates smaller prediction
errors.
• Mean Absolute Error (MAE)
MAE is another metric for evaluating the accuracy of a regression model. It measures the
average absolute differences between the actual and predicted values.
Unlike the squared differences in RMSE, MAE provides a more straightforward
interpretation of the average prediction error.
The lower the MAE, the better the model's performance in terms of absolute prediction
accuracy.
• R-squared (R²)
R-squared is a statistical metric that represents the proportion of the variance in the
dependent variable that is predictable from the independent variables. It typically ranges
from 0 to 1, where 1 indicates a perfect fit; on held-out data it can be negative when the
model predicts worse than the mean of the observed values.
A higher R-squared value signifies a better fit of the regression model to the data. It
quantifies the proportion of variability in the dependent variable that can be explained by the
independent variables. However, caution is needed, as high R-squared does not
necessarily imply causation.
Conclusion:
• Lower values of MAE, MSE, and RMSE imply higher accuracy of a
regression model.
• In contrast, a higher value of R-squared is desirable.
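The four metrics can be computed directly from their definitions, as in the following sketch. The function name `regression_metrics` and the flight-duration values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAE, and R-squared for paired values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e ** 2 for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    # R-squared compares the model's squared error with that of
    # always predicting the mean of the actual values.
    mean_actual = sum(actual) / n
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r_squared = 1 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r_squared}


# Hypothetical flight durations in minutes.
actual_minutes = [100, 110, 120, 130]
predicted_minutes = [102, 108, 123, 127]

metrics = regression_metrics(actual_minutes, predicted_minutes)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that RMSE and MAE are expressed in the units of the outcome (minutes here), MSE in squared units, and R-squared is unitless, which is why the error metrics cannot be compared across outcomes measured on different scales.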
• Predictive Maintenance
Regression is employed to forecast maintenance needs based on historical data. By
analyzing patterns and trends in past maintenance records, aviation companies can
proactively schedule maintenance activities, reducing the risk of unplanned downtime and
improving overall operational efficiency.
• Predictors: Historical maintenance records, Sensor data from aircraft components, Flight
hours and cycles
• Outcome(s): Predicted time until the next maintenance event, Identification of specific
components requiring attention
Regression Use Cases in Aviation
• Fuel Consumption Optimization
Regression models play a crucial role in optimizing fuel efficiency in aviation. By considering
factors such as aircraft type, weather conditions, and flight routes, regression helps in
developing models that guide fuel consumption strategies. This contributes to cost savings
and environmental sustainability.
• Predictors: Aircraft type and model, Weather conditions, Flight route and altitude
• Outcome(s): Optimal fuel consumption rate, Cost-effective fuel usage strategies
• Flight Time Prediction
Regression is utilized for estimating flight durations and enhancing scheduling processes.
By analyzing various variables such as departure and arrival locations, historical data, and
potential delays, regression models assist in predicting more accurate flight times,
improving overall airline scheduling efficiency.
• Predictors: Departure and arrival locations, Historical flight data, Weather forecasts
• Outcome(s): Predicted flight duration, More accurate scheduling of arrival and departure
times
• Passenger Loyalty Prediction
Airlines can leverage regression methods to analyze historical data related to passenger
behavior, including factors like booking frequency, travel preferences, and demographic
information. By building regression models, airlines can predict the likelihood of passengers
remaining loyal to their services. This allows for targeted strategies to enhance passenger
experience, offer personalized incentives, and ultimately foster stronger customer loyalty in
the highly competitive aviation industry.
• Predictors: Historical passenger behavior, Booking frequency, Travel preferences and
patterns, Demographic information
• Outcome(s): Predicted likelihood of a passenger choosing the airline for future travels,
Identification of factors influencing passenger loyalty
• Feature Selection
The selection of features plays a pivotal role. Features are the variables
used to predict the outcome, and choosing relevant ones is crucial for
accurate regression models.
Aviation datasets may contain numerous variables, and careful
consideration is needed to identify which features significantly contribute to
the predictive power of the model.
Feature selection ensures that the chosen variables align with the specific
objectives of the regression task, enhancing the model's efficacy and
interpretability.
Considerations and Challenges
• Multicollinearity
Multicollinearity refers to the presence of correlated predictor variables in a
regression model. In aviation datasets, certain features may exhibit high
correlation, potentially posing challenges.
When predictor variables are highly correlated, it becomes difficult to
distinguish the individual impact of each variable on the outcome.
Addressing multicollinearity is essential for maintaining the stability and
reliability of regression models.
Techniques such as variance inflation factor (VIF) analysis can be employed
to identify and mitigate multicollinearity, ensuring the robustness of the
regression analysis in aviation applications.
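In general, the VIF of a predictor is $1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing that predictor on all the others. The sketch below covers only the two-predictor special case, where $R_j^2$ equals the squared Pearson correlation between the two predictors. The function names and the flight-hours, flight-cycles, and temperature values are hypothetical.

```python
def pearson_r_squared(u, v):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv ** 2 / (suu * svv)


def vif_two_predictors(u, v):
    """VIF shared by both predictors in a two-predictor model."""
    return 1.0 / (1.0 - pearson_r_squared(u, v))


# Hypothetical aircraft records.
flight_hours = [1000, 2000, 3000, 4000, 5000]
flight_cycles = [200, 400, 600, 800, 1100]   # rises almost in step with hours
ambient_temp_c = [11, 14, 10, 14, 11]         # unrelated to hours in this sample

vif_hours_cycles = vif_two_predictors(flight_hours, flight_cycles)
vif_hours_temp = vif_two_predictors(flight_hours, ambient_temp_c)
print(f"hours vs cycles: VIF={vif_hours_cycles:.1f}")
print(f"hours vs temperature: VIF={vif_hours_temp:.1f}")
```

A VIF of 1 means a predictor is uncorrelated with the others. A common rule of thumb treats values above roughly 5 to 10 as a sign of problematic multicollinearity, in which case one of the correlated predictors might be dropped or a regularized method such as Ridge used instead.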
• Data Preprocessing
Data preprocessing is a critical step in aviation regression, emphasizing the
importance of cleaning and preparing data before applying regression
models.
Aviation datasets can be complex, containing various variables with missing
values, outliers, or inconsistencies. Data preprocessing techniques involve
handling missing data, addressing outliers, and transforming variables to
ensure they align with the assumptions of regression models.
By cleaning and preparing the data meticulously, analysts create a robust
foundation for regression analysis, enhancing the accuracy and reliability of
the subsequent modeling.
Best Practices
• Model Selection
Model selection is a key facet of successful aviation regression. Different
regression algorithms may be suited to varying aviation problems, and
selecting the right algorithm is crucial for achieving accurate and meaningful
results.
Considerations such as the nature of the data, the relationship between
variables, and the specific goals of the regression task guide the choice of
regression algorithm. Whether it's linear regression, ridge regression, or
other advanced techniques, thoughtful model selection ensures that the
regression model aligns optimally with the characteristics of the aviation
dataset and the objectives of the analysis.
• Ensemble Regression Models
Ensemble regression models, which combine multiple regression models for
greater accuracy and more robust predictions, are an emerging trend in
aviation analytics.
Ensemble methods, such as Random Forest or Gradient Boosting, leverage
the strength of diverse models to mitigate individual weaknesses and
improve overall performance. By aggregating predictions from multiple
models, ensemble techniques offer a sophisticated approach to handle
complex relationships within aviation datasets, contributing to more reliable
regression outcomes.
Future Trends
• Explainable AI in Regression:
The evolving trend of Explainable AI in regression underscores the
increasing importance of transparency and interpretability in regression
models. As regression techniques become more advanced and complex,
understanding how models arrive at specific predictions becomes crucial,
especially in aviation scenarios where decisions impact safety and
operational efficiency. Explainable AI in regression aims to demystify the
decision-making process of advanced models, providing insights into the
factors influencing predictions and fostering trust in the outcomes. This trend
aligns with the industry's growing emphasis on accountability and
comprehension of AI-driven regression analyses.
• In RapidMiner, using the Repository window, follow the path
Training Resources-Model-Supervised-Linear Regression
and open the Hotel App CLV Linear Regression solution
process.
• In this example, the outcome (label) variable is «Customer
Lifetime Value (CLV)».
• CLV is the total revenue or profit generated by a customer
over the entire course of their relationship with your business.
• Predicting CLV using regression is of paramount importance
in the field of business and marketing.
• Since CLV is also relevant to the airline business, we will use this
regression example.
RapidMiner Example on Linear Regression Model
• In the process window of this model, you can see the main operators for importing data
and for cross validation. The Cross Validation operator also contains the model operators
(training and testing), the data split function, and the performance operator.
• When you double-click the cross validation operator, you can see two main areas for
training and testing. You can simply drag and drop model and performance operators in
these areas.
Cross Validation is a technique used for assessing the performance and generalization ability of a
predictive model. Cross-validation helps to evaluate how well a model will perform on an independent
dataset by partitioning the available data into subsets. The basic idea is to train and test the model
multiple times on different subsets of the data, and then average the performance metrics to get a
more reliable estimate of the model's performance. This operator:
• Splits data into k subsets (folds), typically with a value of k such as 5 or 10.
• The model is trained on k-1 folds and tested on the remaining fold. This process is repeated k
times, each time using a different fold as the test set.
• The performance metrics are recorded for each iteration.
• The average performance across all iterations is calculated to provide a more robust estimate of the
model's performance.
• You can set the «number of folds (k)» and the «sampling type» in the operator parameters
window on the left.
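The k-fold procedure listed above can be sketched outside RapidMiner in a few lines. This is an illustration under stated assumptions, not a description of how the RapidMiner operator is implemented: folds are contiguous blocks without shuffling, the model is a simple linear regression, the performance metric is RMSE, and the function names and customer data are hypothetical.

```python
import math


def fit_simple_linear(x, y):
    """Least squares intercept and slope for one predictor."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate_rmse(x, y, k):
    """Return (per-fold RMSE list, average RMSE) for k-fold cross validation."""
    scores = []
    for test_idx in k_fold_indices(len(x), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        # Train on k-1 folds, then test on the remaining fold.
        b0, b1 = fit_simple_linear([x[i] for i in train_idx],
                                   [y[i] for i in train_idx])
        squared = [(y[i] - (b0 + b1 * x[i])) ** 2 for i in test_idx]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return scores, sum(scores) / len(scores)


# Hypothetical customers: bookings per year vs. customer lifetime value.
bookings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
noise = [3, -2, 1, -4, 2, -1, 4, -3, 2, -2]
clv = [100 + 50 * b + e for b, e in zip(bookings, noise)]

fold_rmse, mean_rmse = cross_validate_rmse(bookings, clv, k=5)
print("RMSE per fold:", [round(s, 2) for s in fold_rmse])
print("Average RMSE:", round(mean_rmse, 2))
```

Because every observation is used for testing exactly once, the averaged score depends less on one lucky or unlucky split than a single train/test partition would. Shuffling the rows before splitting is usually advisable when the data are sorted in a meaningful order.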
• After you run the model, you can find the outputs in the Results view.
• You can find the coefficient of each predictor in the regression model equation, along
with the p-value of each variable.
• You can find the values of performance indicators such as MSE, RMSE, and R-squared.
Keep in mind that you can get more performance indicators by selecting the
corresponding options in the performance operator.
• You can find the predictions for each row in the Results view. These predictions can
be used for further visualization or discussion.
• For example, you can create graphs using the Visualizations tab in the left menu.
After running the regression model you can:
• Explain the relationship between variables.
• Obtain predicted values for the label attribute.
• Get performance metrics for the model and interpret them.
• Visualize the outputs.
• Get statistical outputs for the model.
• In conclusion, this chapter has explored regression methods and their
applications in aviation.
• Built on a solid understanding of predictive modeling, regression models
offer a robust toolkit for predicting numerical outcomes and optimizing
various aspects of the aviation industry.
• To become proficient at applying regression techniques to problems in
aviation analytics, do not stop at grasping these foundational concepts;
actively apply them in practical scenarios using RapidMiner.
Conclusion

More Related Content

Similar to ml-05x01.pdf

Linear Regression
Linear RegressionLinear Regression
Linear Regression
Abdullah al Mamun
 
IEOR 265 Final Paper_Minchao Lin
IEOR 265 Final Paper_Minchao LinIEOR 265 Final Paper_Minchao Lin
IEOR 265 Final Paper_Minchao Lin
Minchao Lin
 
Performance analysis of regularized linear regression models for oxazolines a...
Performance analysis of regularized linear regression models for oxazolines a...Performance analysis of regularized linear regression models for oxazolines a...
Performance analysis of regularized linear regression models for oxazolines a...
ijcsity
 
Guide for building GLMS
Guide for building GLMSGuide for building GLMS
Guide for building GLMS
Ali T. Lotia
 
Mixture Regression Model for Incomplete Data
Mixture Regression Model for Incomplete DataMixture Regression Model for Incomplete Data
Mixture Regression Model for Incomplete Data
Loc Nguyen
 

Similar to ml-05x01.pdf (20)

Linear Regression
Linear RegressionLinear Regression
Linear Regression
 
Forecasting Default Probabilities in Emerging Markets and Dynamical Regula...
Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regula...Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regula...
Forecasting Default Probabilities in Emerging Markets and Dynamical Regula...
 
Introduction-to-Non-Linear-Regression.pptx
Introduction-to-Non-Linear-Regression.pptxIntroduction-to-Non-Linear-Regression.pptx
Introduction-to-Non-Linear-Regression.pptx
 
IEOR 265 Final Paper_Minchao Lin
IEOR 265 Final Paper_Minchao LinIEOR 265 Final Paper_Minchao Lin
IEOR 265 Final Paper_Minchao Lin
 
Regression kriging
Regression krigingRegression kriging
Regression kriging
 
Regression Analysis.pptx
Regression Analysis.pptxRegression Analysis.pptx
Regression Analysis.pptx
 
Regression Analysis Techniques.pptx
Regression Analysis Techniques.pptxRegression Analysis Techniques.pptx
Regression Analysis Techniques.pptx
 
Regression & It's Types
Regression & It's TypesRegression & It's Types
Regression & It's Types
 
CFA II Quantitative Analysis
CFA II Quantitative AnalysisCFA II Quantitative Analysis
CFA II Quantitative Analysis
 
Regression analysis in HR
Regression analysis in HRRegression analysis in HR
Regression analysis in HR
 
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
 
Linear Regression Paper Review.pptx
Linear Regression Paper Review.pptxLinear Regression Paper Review.pptx
Linear Regression Paper Review.pptx
 
Operation's research models
Operation's research modelsOperation's research models
Operation's research models
 
Linear regression
Linear regressionLinear regression
Linear regression
 
NPTL Machine Learning Week 2.docx
NPTL Machine Learning Week 2.docxNPTL Machine Learning Week 2.docx
NPTL Machine Learning Week 2.docx
 
Performance analysis of regularized linear regression models for oxazolines a...
Performance analysis of regularized linear regression models for oxazolines a...Performance analysis of regularized linear regression models for oxazolines a...
Performance analysis of regularized linear regression models for oxazolines a...
 
Machine Learning.pdf
Machine Learning.pdfMachine Learning.pdf
Machine Learning.pdf
 
Guide for building GLMS
Guide for building GLMSGuide for building GLMS
Guide for building GLMS
 
Mixture Regression Model for Incomplete Data
Mixture Regression Model for Incomplete DataMixture Regression Model for Incomplete Data
Mixture Regression Model for Incomplete Data
 
Cmt learning objective 25 - regresseion
Cmt learning objective   25 - regresseionCmt learning objective   25 - regresseion
Cmt learning objective 25 - regresseion
 

More from NextGenATM Erasmus+ Project (20)

ml-09x01.pdf
ml-09x01.pdfml-09x01.pdf
ml-09x01.pdf
 
ml-08x01.pdf
ml-08x01.pdfml-08x01.pdf
ml-08x01.pdf
 
ml-07x01.pdf
ml-07x01.pdfml-07x01.pdf
ml-07x01.pdf
 
ml-06x01.pdf
ml-06x01.pdfml-06x01.pdf
ml-06x01.pdf
 
ml-04x01.pdf
ml-04x01.pdfml-04x01.pdf
ml-04x01.pdf
 
ml-03x01.pdf
ml-03x01.pdfml-03x01.pdf
ml-03x01.pdf
 
ml-02x01.pdf
ml-02x01.pdfml-02x01.pdf
ml-02x01.pdf
 
ml-01x01.pdf
ml-01x01.pdfml-01x01.pdf
ml-01x01.pdf
 
EAVA presentation.pdf
EAVA presentation.pdfEAVA presentation.pdf
EAVA presentation.pdf
 
ESTU presentation.pdf
ESTU presentation.pdfESTU presentation.pdf
ESTU presentation.pdf
 
HSW presentation.pdf
HSW presentation.pdfHSW presentation.pdf
HSW presentation.pdf
 
ts-07x01.pdf
ts-07x01.pdfts-07x01.pdf
ts-07x01.pdf
 
ts-06x01.pdf
ts-06x01.pdfts-06x01.pdf
ts-06x01.pdf
 
ts-05x01.pdf
ts-05x01.pdfts-05x01.pdf
ts-05x01.pdf
 
ts-04x01.pdf
ts-04x01.pdfts-04x01.pdf
ts-04x01.pdf
 
ts-03x01.pdf
ts-03x01.pdfts-03x01.pdf
ts-03x01.pdf
 
ts-02x01.pdf
ts-02x01.pdfts-02x01.pdf
ts-02x01.pdf
 
ts-01x01.pdf
ts-01x01.pdfts-01x01.pdf
ts-01x01.pdf
 
sa-07x01.pdf
sa-07x01.pdfsa-07x01.pdf
sa-07x01.pdf
 
sa-06x01.pdf
sa-06x01.pdfsa-06x01.pdf
sa-06x01.pdf
 

Recently uploaded

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 

Recently uploaded (20)

Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptx
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 

ml-05x01.pdf

  • 1. This course is prepared under the Erasmus+ KA-210-YOU Project titled «Skilling Youth for the Next Generation Air Transport Management» Machine Learning Applications in Aviation Regression Methods Asst. Prof. Dr. Emircan Özdemir Eskişehir Technical University
  • 2. • Regression Methods play a crucial role in predicting numerical outcomes in the context of machine learning. • In aviation industry, forecasting fuel consumption, optimizing flight paths, predicting maintenance needs, anticipating passenger demand for a particular route, optimizing marketing strategies for increased ticket sales, or predicting customer satisfaction levels, regression methods offer a versatile toolkit for modeling intricate relationships between variables in the aviation domain. Regression Methods 2 Introduction
  • 3. • Simple Linear Regression Simple Linear Regression is a foundational regression method used in machine learning and statistics to model the relationship between two variables: one independent variable (predictor) and one dependent variable (outcome). The goal is to establish a linear equation that best fits the observed data points, allowing for predictions or estimations of the dependent variable based on changes in the independent variable. The equation of a simple linear regression line is typically expressed as: Regression Methods 3 Types of Regression Problems
  • 4. • Simple Linear Regression The regression analysis aims to find the values of b0 and b1 that minimize the difference between the observed and predicted values, often using methods like the least squares approach. Regression Methods 4 Types of Regression Problems Source: https://medium.com/@sachin.hs20/simple-linear-regression-using-example-e4e2a89df54c
  • 5. • Simple Linear Regression In aviation, this model could help predict fuel consumption based on the number of passengers, offering insights into how changes in passenger load may impact fuel efficiency. This, in turn, aids in optimizing flight operations, resource planning, and cost management. Regression Methods 5 Types of Regression Problems
  • 6. • Multiple Linear Regression: Multiple Linear Regression is an extension of Simple Linear Regression, allowing for the modeling of the relationship between a dependent variable and multiple independent variables. In this method, the goal is to create a linear equation that best fits the observed data points by considering the impact of multiple predictors on the outcome variable. The general form of the Multiple Linear Regression equation is $y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$, where $x_1, \dots, x_p$ are the $p$ independent variables, $b_0$ is the intercept, and $b_1, \dots, b_p$ are the coefficients of the predictors.
  • 7. • Multiple Linear Regression: In the context of aviation, consider a scenario where the fuel consumption of an aircraft (dependent variable) is influenced not only by the number of passengers (as in Simple Linear Regression) but also by additional factors such as flight distance, weather conditions, and aircraft type. Multiple Linear Regression enables the creation of a model that incorporates all these variables to provide a more comprehensive understanding of how they collectively influence fuel consumption. Source: https://www.mathworks.com/help/stats/regress.html
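With several predictors, the least squares coefficients can be obtained by solving the normal equations. The sketch below does this in plain Python with a small Gauss-Jordan solver; the function names and the data are hypothetical (the fuel values are constructed as exactly 1000 + 10 × passengers + 3 × distance so the result is easy to check). In practice a numerical library would normally be used instead of a hand-written solver.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple_linear_regression(rows, y):
    """Return [b0, b1, ..., bp] from the normal equations (X^T X) b = X^T y."""
    x = [[1.0] + list(r) for r in rows]  # leading 1.0 models the intercept
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical flights: (passengers, distance in km) and fuel burned (kg).
flights = [(100, 500), (150, 800), (120, 1200), (180, 600), (160, 1500)]
fuel_kg = [3500, 4900, 5800, 4600, 7100]

coefficients = fit_multiple_linear_regression(flights, fuel_kg)
print([round(c, 3) for c in coefficients])
```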
  • 8. • Polynomial Regression: Polynomial Regression is a type of regression analysis that extends the concept of linear regression by introducing polynomial terms, allowing for the modeling of non-linear relationships between the dependent and independent variables. In other words, it accommodates situations where the relationship between the variables cannot be adequately represented by a straight line.
  • 9. • Polynomial Regression: The Polynomial Regression equation takes the form $y = b_0 + b_1 x + b_2 x^2 + \dots + b_n x^n$, where $n$ is the degree of the polynomial.
  • 10. • Polynomial Regression: The choice of the degree of the polynomial (n) depends on the complexity of the relationship in the data. A higher-degree polynomial can fit the data more closely but risks overfitting, capturing noise rather than true patterns. In the context of aviation, Polynomial Regression can be applied when predicting variables with non-linear trends. For example, modeling the relationship between aircraft altitude and fuel efficiency might call for a polynomial model, as the impact of altitude on fuel efficiency may not follow a straight line. Polynomial Regression allows for a more flexible representation of such relationships, improving the accuracy of predictions where non-linear patterns are evident.
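A polynomial model is still linear in its coefficients, so it can be fitted like a multiple linear regression whose predictors are the powers of x. The short sketch below shows only this feature expansion and the resulting prediction; the function names and the coefficient values are hypothetical and chosen purely for illustration.

```python
def polynomial_features(x, degree):
    """Expand a single value x into [x, x**2, ..., x**degree]."""
    return [x ** power for power in range(1, degree + 1)]


def predict_polynomial(coefficients, x):
    """Evaluate b0 + b1*x + ... + bn*x**n for coefficients [b0, b1, ..., bn]."""
    degree = len(coefficients) - 1
    features = polynomial_features(x, degree)
    return coefficients[0] + sum(b * f for b, f in zip(coefficients[1:], features))


# Hypothetical degree-2 model of an efficiency score against altitude (thousand ft).
coefficients = [1.0, 0.5, -0.01]
features_at_2 = polynomial_features(2.0, 3)
score_at_10 = predict_polynomial(coefficients, 10.0)
print(features_at_2, score_at_10)
```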
  • 11. Popular Regression Algorithms • Linear Regression: Linear Regression serves as the foundational algorithm for modeling linear relationships between variables. It is particularly useful in scenarios where a clear, straight-line relationship can be established, such as predicting fuel consumption based on flight time.
  • 12. • Ridge Regression: Ridge Regression adds regularization to ordinary linear regression. It minimizes the sum of squared residuals plus a penalty on the sum of squared coefficients (the L2 regularization term). In aviation, where datasets often exhibit multicollinearity among predictor variables, this penalty helps mitigate overfitting and improves the model's generalizability, contributing to more accurate predictions.
  • 13. • Lasso Regression: Lasso Regression has an aim similar to Ridge Regression but uses an L1 regularization term, which penalizes the absolute values of the coefficients. This promotes sparsity: certain coefficients are pushed to exactly zero, so the model effectively performs automatic feature selection. This is valuable in aviation scenarios where selecting the most relevant features can significantly affect the accuracy and efficiency of regression models, for example when identifying and prioritizing the key variables that influence aircraft maintenance needs.
  • 14. • Elastic Net Regression: Elastic Net Regression is a hybrid regularization technique that combines Ridge Regression and Lasso Regression. It minimizes the sum of squared residuals, like ordinary linear regression, plus both an L1 penalty on the sum of absolute coefficient values and an L2 penalty on the sum of squared coefficient values. It is suitable for situations where there are many correlated features and automatic feature selection is desired.
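The three regularized objectives described in the slides above can be written side by side. The notation here is an assumption, not taken from the slides: $m$ is the number of observations, $\hat{y}_i = b_0 + \sum_{j=1}^{p} b_j x_{ij}$ is the prediction for observation $i$, and $\lambda, \lambda_1, \lambda_2 \ge 0$ are penalty strengths chosen by the analyst. By the usual convention, the intercept $b_0$ is not penalized.

```latex
\begin{aligned}
\text{Ridge:} \quad & \min_{b}\; \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} b_j^2 \\
\text{Lasso:} \quad & \min_{b}\; \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} |b_j| \\
\text{Elastic Net:} \quad & \min_{b}\; \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 + \lambda_1 \sum_{j=1}^{p} |b_j| + \lambda_2 \sum_{j=1}^{p} b_j^2
\end{aligned}
```

Setting the penalty strengths to zero recovers ordinary least squares in all three cases.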
  • 15. Overfitting and Underfitting • Building the right model is important to avoid overfitting and underfitting. In some cases, the Ridge, Lasso, and Elastic Net regression algorithms help us make better long-term predictions by balancing overfitting against underfitting. Source: https://www.geeksforgeeks.org/lasso-vs-ridge-vs-elastic-net-ml/
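How the Ridge and Lasso penalties act on a coefficient is easiest to see with a single predictor, where both have closed-form solutions. This is a minimal sketch under stated assumptions: the objectives are the sum of squared residuals plus `lam * b1**2` (Ridge) or `lam * abs(b1)` (Lasso), the intercept is left unpenalized, and the function names and data are hypothetical.

```python
import math


def centered_sums(x, y):
    """Return (sxy, sxx) computed on mean-centered data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy, sxx


def ols_slope(x, y):
    sxy, sxx = centered_sums(x, y)
    return sxy / sxx


def ridge_slope(x, y, lam):
    """Ridge shrinks the slope towards zero but never makes it exactly zero."""
    sxy, sxx = centered_sums(x, y)
    return sxy / (sxx + lam)


def lasso_slope(x, y, lam):
    """Lasso soft-thresholds the slope, which can set it to exactly zero."""
    sxy, sxx = centered_sums(x, y)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return math.copysign(shrunk, sxy) / sxx


# Hypothetical centered predictor and outcome values.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-1.0, -0.4, 0.1, 0.3, 1.0]

slope_ols = ols_slope(x, y)
slope_ridge = ridge_slope(x, y, lam=10.0)
slope_lasso_weak = lasso_slope(x, y, lam=4.0)
slope_lasso_strong = lasso_slope(x, y, lam=10.0)
print(slope_ols, slope_ridge, slope_lasso_weak, slope_lasso_strong)
```

With the stronger penalty the Lasso slope becomes exactly zero while the Ridge slope is only halved, which is the behaviour behind Lasso's use for feature selection.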
  • 16. Regression Evaluation Metrics • Mean Squared Error (MSE): MSE is a regression evaluation metric that quantifies the average squared difference between predicted values and actual values in the dataset. A lower MSE indicates better model performance, with smaller errors between predicted and actual values. Because the errors are squared, MSE is expressed in squared units of the outcome variable and penalizes large errors more heavily.
  • 17. • Root Mean Squared Error (RMSE): The RMSE is a commonly used metric for evaluating the accuracy of a regression model. It is the square root of the average squared difference between the actual and predicted values, which puts it in the same units as the outcome variable. The lower the RMSE, the better the model's performance, as it indicates smaller prediction errors.
  • 18. • Mean Absolute Error (MAE): MAE is another metric for evaluating the accuracy of a regression model. It measures the average absolute difference between the actual and predicted values. Because the errors are not squared, large errors are not weighted disproportionately, so MAE reads directly as the typical size of a prediction error. The lower the MAE, the better the model's performance in terms of absolute prediction accuracy.
  • 19. • R-squared ($R^2$): R-squared is a statistical metric that represents the proportion of the variance in the dependent variable that can be explained by the independent variables. It typically ranges from 0 to 1, where 1 indicates a perfect fit; on new data it can fall below 0 when the model predicts worse than simply using the mean. A higher R-squared value signifies a better fit of the regression model to the data. However, caution is needed, as a high R-squared does not necessarily imply causation.
  • 20. Conclusion: • Lower values of MAE, MSE, and RMSE imply higher accuracy of a regression model. • A higher value of R-squared, in contrast, is desirable.
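The four metrics above can be computed directly from their definitions. The following plain-Python sketch uses a small set of made-up actual and predicted values; the function names are hypothetical.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the MSE, in the same units as the outcome."""
    return math.sqrt(mean_squared_error(actual, predicted))


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 minus (residual sum of squares / total sum of squares around the mean)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted values for four observations.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.0, 8.0, 9.5]

mse = mean_squared_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"MSE={mse:.3f} RMSE={rmse:.3f} MAE={mae:.3f} R2={r2:.3f}")
```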
  • 21. Regression Use Cases in Aviation • Predictive Maintenance: Regression is employed to forecast maintenance needs based on historical data. By analyzing patterns and trends in past maintenance records, aviation companies can proactively schedule maintenance activities, reducing the risk of unplanned downtime and improving overall operational efficiency. • Predictors: Historical maintenance records, Sensor data from aircraft components, Flight hours and cycles • Outcome(s): Predicted time until the next maintenance event, Identification of specific components requiring attention
  • 22. • Fuel Consumption Optimization: Regression models play a crucial role in optimizing fuel efficiency in aviation. By considering factors such as aircraft type, weather conditions, and flight routes, regression helps in developing models that guide fuel consumption strategies. This contributes to cost savings and environmental sustainability. • Predictors: Aircraft type and model, Weather conditions, Flight route and altitude • Outcome(s): Optimal fuel consumption rate, Cost-effective fuel usage strategies
  • 23. • Flight Time Prediction: Regression is used to estimate flight durations and improve scheduling processes. By analyzing variables such as departure and arrival locations, historical data, and potential delays, regression models help predict flight times more accurately, improving overall airline scheduling efficiency. • Predictors: Departure and arrival locations, Historical flight data, Weather forecasts • Outcome(s): Predicted flight duration, More accurate scheduling of arrival and departure times
  • 24. • Passenger Loyalty Prediction: Airlines can leverage regression methods to analyze historical data related to passenger behavior, including factors like booking frequency, travel preferences, and demographic information. By building regression models, airlines can predict the likelihood of passengers remaining loyal to their services. This allows for targeted strategies to enhance passenger experience, offer personalized incentives, and ultimately foster stronger customer loyalty in the highly competitive aviation industry. • Predictors: Historical passenger behavior, Booking frequency, Travel preferences and patterns, Demographic information • Outcome(s): Predicted likelihood of a passenger choosing the airline for future travels, Identification of factors influencing passenger loyalty
  • 25. Considerations and Challenges • Feature Selection: Features are the variables used to predict the outcome, and choosing relevant ones is crucial for accurate regression models. Aviation datasets may contain numerous variables, and careful consideration is needed to identify which features significantly contribute to the predictive power of the model. Feature selection ensures that the chosen variables align with the specific objectives of the regression task, enhancing the model's efficacy and interpretability.
  • 26. • Multicollinearity: Multicollinearity refers to the presence of correlated predictor variables in a regression model. In aviation datasets, certain features may exhibit high correlation, potentially posing challenges. When predictor variables are highly correlated, it becomes difficult to distinguish the individual impact of each variable on the outcome. Addressing multicollinearity is essential for maintaining the stability and reliability of regression models. Techniques such as variance inflation factor (VIF) analysis can be employed to identify and mitigate multicollinearity, ensuring the robustness of the regression analysis in aviation applications.
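The VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on the other predictors. With only two predictors, this R² equals their squared correlation, which keeps the sketch below short. The function names and the data are hypothetical; with more predictors a full auxiliary regression would be needed for each one.

```python
def r_squared_between(x1, x2):
    """Squared Pearson correlation between two predictors."""
    n = len(x1)
    mean_1, mean_2 = sum(x1) / n, sum(x2) / n
    s12 = sum((a - mean_1) * (b - mean_2) for a, b in zip(x1, x2))
    s11 = sum((a - mean_1) ** 2 for a in x1)
    s22 = sum((b - mean_2) ** 2 for b in x2)
    return s12 ** 2 / (s11 * s22)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    return 1.0 / (1.0 - r_squared_between(x1, x2))


# Hypothetical predictors: flight hours (thousands) and flight cycles (thousands).
flight_hours = [1.0, 2.0, 3.0, 4.0, 5.0]
flight_cycles = [2.0, 4.0, 5.0, 4.0, 5.0]

vif = vif_two_predictors(flight_hours, flight_cycles)
print(f"VIF = {vif:.2f}")
```

A VIF of 1 means no correlation with the other predictors. As a rule of thumb, values above roughly 5 to 10 are often treated as a warning sign, although the exact threshold is a judgment call.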
  • 27. Best Practices • Data Preprocessing: Data preprocessing, the cleaning and preparation of data before regression models are applied, is a critical step in aviation regression. Aviation datasets can be complex, containing variables with missing values, outliers, or inconsistencies. Data preprocessing techniques involve handling missing data, addressing outliers, and transforming variables so that they align with the assumptions of regression models. By cleaning and preparing the data meticulously, analysts create a robust foundation for regression analysis, enhancing the accuracy and reliability of the subsequent modeling.
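Two of the preprocessing steps mentioned above, handling missing data and transforming variables, can be sketched with the Python standard library. Median imputation and standardization are only one possible choice for each step, and the function names and values are hypothetical.

```python
from statistics import mean, median, pstdev


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical flight durations in hours, with two missing records.
flight_hours = [2.0, None, 3.0, 2.5, None, 4.5]

imputed = impute_missing(flight_hours)
scaled = standardize(imputed)
print(imputed)
print([round(v, 3) for v in scaled])
```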
  • 28. • Model Selection: Model selection is a key facet of successful aviation regression. Different regression algorithms may be suited to varying aviation problems, and selecting the right algorithm is crucial for achieving accurate and meaningful results. Considerations such as the nature of the data, the relationship between variables, and the specific goals of the regression task guide the choice of regression algorithm. Whether it's linear regression, ridge regression, or other advanced techniques, thoughtful model selection ensures that the regression model aligns optimally with the characteristics of the aviation dataset and the objectives of the analysis.
  • 29. Future Trends • Ensemble Regression Models: Combining multiple regression models for more accurate and robust predictions is a growing trend in aviation analytics. Ensemble methods, such as Random Forest or Gradient Boosting, leverage the strengths of diverse models to mitigate individual weaknesses and improve overall performance. By aggregating predictions from multiple models, ensemble techniques offer a sophisticated approach to handling complex relationships within aviation datasets, contributing to more reliable regression outcomes.
  • 30. • Explainable AI in Regression: The evolving trend of Explainable AI in regression underscores the increasing importance of transparency and interpretability in regression models. As regression techniques become more advanced and complex, understanding how models arrive at specific predictions becomes crucial, especially in aviation scenarios where decisions impact safety and operational efficiency. Explainable AI in regression aims to demystify the decision-making process of advanced models, providing insights into the factors influencing predictions and fostering trust in the outcomes. This trend aligns with the industry's growing emphasis on accountability and comprehension of AI-driven regression analyses.
  • 31. RapidMiner Example on Linear Regression Model • In RapidMiner, using the Repository window, follow the path Training Resources-Model-Supervised-Linear Regression and open the Hotel App CLV Linear Regression solution process. • In this example, the outcome (label) variable is «Customer Lifetime Value (CLV)». • CLV is the total revenue or profit generated by a customer over the entire course of their relationship with your business. • Predicting CLV using regression is of paramount importance in the field of business and marketing. • Because CLV is also relevant to the airline business, we will use this regression example.
  • 32. • In the process window of this model, you can see the main operators for data import and cross validation. The cross validation operator also contains the model operators (train and test), the data split function, and the performance operator.
  • 33. • When you double-click the cross validation operator, you can see two main areas for training and testing. You can simply drag and drop model and performance operators into these areas.
  • 34. Cross Validation is a technique used for assessing the performance and generalization ability of a predictive model. Cross-validation helps to evaluate how well a model will perform on an independent dataset by partitioning the available data into subsets. The basic idea is to train and test the model multiple times on different subsets of the data, and then average the performance metrics to get a more reliable estimate of the model's performance. This operator: • Splits the data into k subsets (folds), typically with k set to 5 or 10. • Trains the model on k-1 folds and tests it on the remaining fold, repeating this k times with a different fold as the test set each time. • Records the performance metrics for each iteration. • Averages the performance across all iterations to provide a more robust estimate of the model's performance. • You can set the «number of folds (k)» and the «sampling type» in the operator parameters window on the left.
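The k-fold procedure listed above can be sketched outside RapidMiner in plain Python. This is not RapidMiner's implementation: it is an illustrative sketch that uses consecutive folds without shuffling, a simple least squares line as the model, and MSE as the performance metric. All function names and the data are hypothetical.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k consecutive folds (no shuffling)."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def fit_line(x, y):
    """Least squares fit of y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def cross_validate(x, y, k):
    """Return the per-fold test MSE values and their average."""
    scores = []
    for train, test in k_fold_splits(len(x), k):
        b0, b1 = fit_line([x[i] for i in train], [y[i] for i in train])
        squared_errors = [(y[i] - (b0 + b1 * x[i])) ** 2 for i in test]
        scores.append(sum(squared_errors) / len(squared_errors))
    return scores, sum(scores) / len(scores)


# Hypothetical data: a linear trend with a small alternating disturbance.
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 + (0.1 if i % 2 == 0 else -0.1) for i, xi in enumerate(x)]

fold_scores, average_mse = cross_validate(x, y, k=5)
print(fold_scores, average_mse)
```

Each observation appears in exactly one test fold, so every data point contributes once to the averaged performance estimate.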
  • 35. • After you run the model, you can find the outputs in the Results view. • There you can find the coefficient of each predictor in the regression model equation, together with its p-value.
  • 36. • You can find the values of performance indicators such as MSE, RMSE, and R-squared. Keep in mind that you can obtain more performance indicators by selecting the corresponding options in the performance operator.
  • 37. • You can find the predictions for each row in the Results view. These predictions can be used for further visualization or discussion.
  • 38. • For example, you can create graphs using the visualizations tab in the left menu options.
  • 39. After running the regression model, you can: • Explain the relationship between variables. • Obtain predicted values for the label attribute. • Get performance metrics for the model and interpret them. • Visualize the outputs. • Get statistical outputs for the model.
  • 40. Conclusion • This chapter provides a comprehensive exploration of regression methods in aviation. • Built on a solid understanding of predictive modeling, regression models offer a robust toolkit for predicting numerical outcomes and optimizing various aspects of the aviation industry. • To enhance your proficiency in leveraging regression techniques for nuanced problem-solving in aviation analytics, do not only grasp these foundational concepts but also actively apply them in practical scenarios using RapidMiner.