The aim of the project is to predict landing overrun for an aircraft given the airstrip length of the airport. The methodology used is quadratic linear regression. Mean R-squared of 98% is achieved with MSE being just 9% of the mean landing distance.
Analytical Evaluation of Generalized Predictive Control Algorithms Using a Fu...inventy
Research Inventy : International Journal of Engineering and Science is published by the group of young academic and industrial researchers with 12 Issues per year. It is an online as well as print version open access journal that provides rapid publication (monthly) of articles in all areas of the subject such as: civil, mechanical, chemical, electronic and computer engineering as well as production and information technology. The Journal welcomes the submission of manuscripts that meet the general criteria of significance and scientific excellence. Papers will be published by rapid process within 20 days after acceptance and peer review process takes only 7 days. All articles published in Research Inventy will be peer-reviewed.
The objective of this analysis is to quantify the factors that impact the landing distance of a commercial flight and built a linear regression model to predict the risk of landing overrun.
M.G.Goman, A.V.Khramtsovsky (2008) - Computational framework for investigatio...Project KRIT
М.Г.Гоман, А.В.Храмцовский «Методика численного исследования нелинейной динамики самолёта», Advances in Engineering Software 39 (2008), стр..167-177
M.G.Goman, A.V.Khramtsovsky "Computational framework for investigation of aircraft nonlinear dynamics", Advances in Engineering Software 39 (2008) pp.167-177
A computational framework based on qualitative theory, parameter continuation and bifurcation analysis is outlined and illustrated by a number of examples for inertia coupled roll maneuvers. The focus is on the accumulation of computed results in a special database and its incorporation into the investigation process. Ways to automate the investigation of aircraft nonlinear dynamics are considered.
Описана и проиллюстрирована на ряде примеров (связанных с инерционным вращением) методика численного анализа, основанная на теории качественного анализа, методе продолжения решений, зависящих от параметра и на бифуркационном анализе. Особое внимание уделяется накоплению результатов расчетов в специальной базе данных и интеграции этой базы в процесс исследований. Рассмотрены способы автоматизации исследований нелинейной динамики самолёта.
M.Goman et al (1993) - Aircraft Spin Prevention / Recovery Control SystemProject KRIT
М.Г.Гоман, А.В.Храмцовский, В.Л.Суханов, В.А.Сыроватский, К.А.Татарников «Система предотвращения попадания / вывода самолёта из штопора», доклад на 3-й российско-китайской научной конференции по аэродинамике и динамике полёта самолёта, Центральный Аэрогидродинамический институт (ЦАГИ), г.Жуковский, 1993 г., 14 стр.
M.Goman, A.Khramtsovsky, V.Soukhanov, V.Syrovatsky and K.Tatarnikov "Aircraft Spin Prevention/Recovery Control System", presented at the Third Russian-Chinese Scientific Conference on Aerodynamics and Flight Dynamics of Aircraft, Central Aerohydrodynamic Institute (TsAGI), Zhukovsky, Moscow region, Russia, November 9-12, 1993, 14 pp.
Data-driven adaptive predictive control for an activated sludge processjournalBEEI
Data-driven control requires no information of the mathematical model of the controlled process. This paper proposes the direct identification of controller parameters of activated sludge process. This class of data-driven control calculates the predictive controller parameters directly using subspace identification technique. By updating input-output data using receding window mechanism, the adaptive strategy can be achieved. The robustness test and stability analysis of direct adaptive model predictive control are discussed to realize the effectiveness of this adaptive control scheme. The applicability of the controller algorithm to adapt into varying kinetic parameters and operating conditions is evaluated. Simulation results show that by a proper and effective excitation of direct identification of controller parameters, the convergence and stability of the implicit predictive model can be achieved.
Calibration of Environmental Sensor Data Using a Linear Regression Techniqueijtsrd
The linear regression is a linear approach for modeling the relationship between a scalar dependent variable y and one or more explanatory variables denoted X. A linear regression analysis method is used to improve the performance of the instruments using light scattering method. The data measured by these instruments have to be converted to actual PM10 concentrations using some factors. These findings propose that the instruments using light scattering method can be used to measure and control the PM10 concentrations of the underground subway stations. In this study, the reliability of the instrument for PM measurement using light scattering method was improved with the help of a linear regression analysis technique to continuously measure the PM10 concentrations in the subway stations. In addition, the linear regression analysis technique was also applied to calibrate a radon counter with the help of an expensive radon measuring apparatus. Chungyong Kim | Gyu-Sik Kim"Calibration of Environmental Sensor Data Using a Linear Regression Technique" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-1 , December 2017, URL: http://www.ijtsrd.com/papers/ijtsrd7060.pdf http://www.ijtsrd.com/engineering/electrical-engineering/7060/calibration-of-environmental-sensor-data-using-a-linear-regression-technique/chungyong-kim
Analytical Evaluation of Generalized Predictive Control Algorithms Using a Fu...inventy
Research Inventy : International Journal of Engineering and Science is published by the group of young academic and industrial researchers with 12 Issues per year. It is an online as well as print version open access journal that provides rapid publication (monthly) of articles in all areas of the subject such as: civil, mechanical, chemical, electronic and computer engineering as well as production and information technology. The Journal welcomes the submission of manuscripts that meet the general criteria of significance and scientific excellence. Papers will be published by rapid process within 20 days after acceptance and peer review process takes only 7 days. All articles published in Research Inventy will be peer-reviewed.
The objective of this analysis is to quantify the factors that impact the landing distance of a commercial flight and built a linear regression model to predict the risk of landing overrun.
M.G.Goman, A.V.Khramtsovsky (2008) - Computational framework for investigatio...Project KRIT
М.Г.Гоман, А.В.Храмцовский «Методика численного исследования нелинейной динамики самолёта», Advances in Engineering Software 39 (2008), стр..167-177
M.G.Goman, A.V.Khramtsovsky "Computational framework for investigation of aircraft nonlinear dynamics", Advances in Engineering Software 39 (2008) pp.167-177
A computational framework based on qualitative theory, parameter continuation and bifurcation analysis is outlined and illustrated by a number of examples for inertia coupled roll maneuvers. The focus is on the accumulation of computed results in a special database and its incorporation into the investigation process. Ways to automate the investigation of aircraft nonlinear dynamics are considered.
Описана и проиллюстрирована на ряде примеров (связанных с инерционным вращением) методика численного анализа, основанная на теории качественного анализа, методе продолжения решений, зависящих от параметра и на бифуркационном анализе. Особое внимание уделяется накоплению результатов расчетов в специальной базе данных и интеграции этой базы в процесс исследований. Рассмотрены способы автоматизации исследований нелинейной динамики самолёта.
M.Goman et al (1993) - Aircraft Spin Prevention / Recovery Control SystemProject KRIT
М.Г.Гоман, А.В.Храмцовский, В.Л.Суханов, В.А.Сыроватский, К.А.Татарников «Система предотвращения попадания / вывода самолёта из штопора», доклад на 3-й российско-китайской научной конференции по аэродинамике и динамике полёта самолёта, Центральный Аэрогидродинамический институт (ЦАГИ), г.Жуковский, 1993 г., 14 стр.
M.Goman, A.Khramtsovsky, V.Soukhanov, V.Syrovatsky and K.Tatarnikov "Aircraft Spin Prevention/Recovery Control System", presented at the Third Russian-Chinese Scientific Conference on Aerodynamics and Flight Dynamics of Aircraft, Central Aerohydrodynamic Institute (TsAGI), Zhukovsky, Moscow region, Russia, November 9-12, 1993, 14 pp.
Data-driven adaptive predictive control for an activated sludge processjournalBEEI
Data-driven control requires no information of the mathematical model of the controlled process. This paper proposes the direct identification of controller parameters of activated sludge process. This class of data-driven control calculates the predictive controller parameters directly using subspace identification technique. By updating input-output data using receding window mechanism, the adaptive strategy can be achieved. The robustness test and stability analysis of direct adaptive model predictive control are discussed to realize the effectiveness of this adaptive control scheme. The applicability of the controller algorithm to adapt into varying kinetic parameters and operating conditions is evaluated. Simulation results show that by a proper and effective excitation of direct identification of controller parameters, the convergence and stability of the implicit predictive model can be achieved.
Calibration of Environmental Sensor Data Using a Linear Regression Techniqueijtsrd
The linear regression is a linear approach for modeling the relationship between a scalar dependent variable y and one or more explanatory variables denoted X. A linear regression analysis method is used to improve the performance of the instruments using light scattering method. The data measured by these instruments have to be converted to actual PM10 concentrations using some factors. These findings propose that the instruments using light scattering method can be used to measure and control the PM10 concentrations of the underground subway stations. In this study, the reliability of the instrument for PM measurement using light scattering method was improved with the help of a linear regression analysis technique to continuously measure the PM10 concentrations in the subway stations. In addition, the linear regression analysis technique was also applied to calibrate a radon counter with the help of an expensive radon measuring apparatus. Chungyong Kim | Gyu-Sik Kim"Calibration of Environmental Sensor Data Using a Linear Regression Technique" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-1 , December 2017, URL: http://www.ijtsrd.com/papers/ijtsrd7060.pdf http://www.ijtsrd.com/engineering/electrical-engineering/7060/calibration-of-environmental-sensor-data-using-a-linear-regression-technique/chungyong-kim
Potential Benefits and Implementation of MM5 and RUC2 Data with the CALPUFF A...BREEZE Software
At the absolute minimum, CALMET (CALPUFF’s meteorological preprocessor) requires hourly measurements of surface meteorological data and twice-daily upper air data soundings.
This paper develops a hybrid Measure-Correlate-Predict (MCP) strategy to predict the long term wind resource variations at a farm site. The hybrid MCP method uses the recorded data of multiple reference stations to estimate the long term wind condition at the target farm site. The weight of each reference station in the hybrid strategy is determined based on: (i) the distance and (ii) the elevation difference between the target farm site and each reference station. The applicability of the proposed hybrid strategy is investigated using four different MCP methods: (i) linear regression; (ii) variance ratio; (iii) Weibull scale; and (iv) Artificial Neural Networks (ANNs). To implement this method, we use the hourly averaged wind data recorded at six stations in North Dakota between the year 2008 and 2010. The station Pillsbury is selected as the target farm site. The recorded data at the other five stations
(Dazey, Galesbury, Hillsboro, Mayville and Prosper) is used as reference station data. Three sets of performance metrics are used to evaluate the hybrid MCP method. The first set of metrics analyze the statistical performance, including the mean wind speed, the wind speed variance, the root mean squared error, and the maximum absolute error. The second set of metrics evaluate the distribution of long term wind speed; to this end, the Weibull distribution and the Multivariate and Multimodal Wind Distribution (MMWD) models are adopted in this paper. The third set of metrics analyze the energy production capacity and the efficiency of the wind farm. The results illustrate that the many-to-one correlation in such a hybrid approach can provide more reliable prediction of the long term onsite wind variations, compared to one-to-one correlations.
М.Г.Гоман, А.В.Храмцовский, М.Шапиро «Разработка моделей аэродинамики и моделирование динамики самолета на больших углах атаки», доклад на международной конференции «Тренажерные технологии и обучение», прошедей в ЦАГИ, г.Жуковский, 24-25 мая 2001 г.
M.Goman, A.Khramtsovsky and M.Shapiro "Aerodynamics Modeling and Dynamics Simulation at High Angles of Attack", presentation at the International conference on Simulation Technology & Training held at TsAGI, Zhukovsky (Russia), on 24 May 2001.
CFD Analysis of ATGM Configurations -- Zeus NumerixAbhishek Jain
Above Research Paper can be downloaded from www.zeusnumerix.com
The research paper aims to study the aerodynamic configuration of anti-tank guided missile (ATGM). Effect of rotation of the ATGM on it own axis is simulated along with the curved fins. Areas of flow separation and higher drag are visualized using the variation of axial force along the length. The paper emphasizes on the contribution of base drag to the total drag. Authors A Venkateshwarlu and Hem Raj (BDL), Kumar Mihir and Sanjay Kumar (Zeus Numerix), Prof KE Prasad JNTU.
Development of Speed Profile Model for Two-Lane Rural Roadinventionjournals
It's desirable to make road design match with driver's expectation and consistency of road design implies harmonized connection among road design elements. A road alignment safety evaluation based on road design consistency which has been accelerated recently is designed to calculate the variation of driving speed depending on geometric structure based on operating speed profile and the result is used as the index in evaluating the design consistency. Instead of the safety of individual road design elements (tangent or curve), consistency of continuous design elements is rather stressed. Under such background, driving speed prediction model at horizontal curve/tangent and acceleration/deceleration model at entry and exit of horizontal curve were developed in this study in a bid to evaluate the alignment safety of a 2-lane rural road, which will be used to supplement current road design standard to operating speed-based standard so as to make commitment to designing the road from driver's standpoint, not the viewpoint of road designer or policy-maker.
International Journal of Mathematics and Statistics Invention (IJMSI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJMSI publishes research articles and reviews within the whole field Mathematics and Statistics, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
A METHOD OF TARGET TRACKING AND PREDICTION BASED ON GEOMAGNETIC SENSOR TECHNO...cscpconf
In view of the inherent defects in current airport surface surveillance system, this paper
proposes an asynchronous target-perceiving-event driven surface target surveillance scheme
based on the geomagnetic sensor technology. Furthermore, a surface target tracking and
prediction algorithm based on I-IMM is given, which is improved on the basis of IMM
algorithm in the following aspects: Weighted sum is performed on the mean of residual errors
and model probabilistic likelihood function is reconstructed, thus increasing the identification
of a true motion model; Fixed model transition probability is updated with model posterior
information, thus accelerating model switching as well as increasing the identification of a
model. In the period when a target is non-perceptible, prediction of target trajectories can be
implemented through the target motion model identified with I-IMM algorithm. Simulation
results indicate that I-IMM algorithm is more effective and advantageous in comparison with
the standard IMM algorithm.
Studied the factors that impact the landing distance of a commercial flight and built a linear regression model to predict the risk of landing overrun.
Tools used: SAS
Failure Diagnostic and Performance Monitoring, it is a part of CM program for airlines, it is addressing the condition of components of the aircraft, either it is initial , random or wear failures.
Potential Benefits and Implementation of MM5 and RUC2 Data with the CALPUFF A...BREEZE Software
At the absolute minimum, CALMET (CALPUFF’s meteorological preprocessor) requires hourly measurements of surface meteorological data and twice-daily upper air data soundings.
This paper develops a hybrid Measure-Correlate-Predict (MCP) strategy to predict the long term wind resource variations at a farm site. The hybrid MCP method uses the recorded data of multiple reference stations to estimate the long term wind condition at the target farm site. The weight of each reference station in the hybrid strategy is determined based on: (i) the distance and (ii) the elevation difference between the target farm site and each reference station. The applicability of the proposed hybrid strategy is investigated using four different MCP methods: (i) linear regression; (ii) variance ratio; (iii) Weibull scale; and (iv) Artificial Neural Networks (ANNs). To implement this method, we use the hourly averaged wind data recorded at six stations in North Dakota between the year 2008 and 2010. The station Pillsbury is selected as the target farm site. The recorded data at the other five stations
(Dazey, Galesbury, Hillsboro, Mayville and Prosper) is used as reference station data. Three sets of performance metrics are used to evaluate the hybrid MCP method. The first set of metrics analyze the statistical performance, including the mean wind speed, the wind speed variance, the root mean squared error, and the maximum absolute error. The second set of metrics evaluate the distribution of long term wind speed; to this end, the Weibull distribution and the Multivariate and Multimodal Wind Distribution (MMWD) models are adopted in this paper. The third set of metrics analyze the energy production capacity and the efficiency of the wind farm. The results illustrate that the many-to-one correlation in such a hybrid approach can provide more reliable prediction of the long term onsite wind variations, compared to one-to-one correlations.
М.Г.Гоман, А.В.Храмцовский, М.Шапиро «Разработка моделей аэродинамики и моделирование динамики самолета на больших углах атаки», доклад на международной конференции «Тренажерные технологии и обучение», прошедей в ЦАГИ, г.Жуковский, 24-25 мая 2001 г.
M.Goman, A.Khramtsovsky and M.Shapiro "Aerodynamics Modeling and Dynamics Simulation at High Angles of Attack", presentation at the International conference on Simulation Technology & Training held at TsAGI, Zhukovsky (Russia), on 24 May 2001.
CFD Analysis of ATGM Configurations -- Zeus NumerixAbhishek Jain
Above Research Paper can be downloaded from www.zeusnumerix.com
The research paper aims to study the aerodynamic configuration of anti-tank guided missile (ATGM). Effect of rotation of the ATGM on it own axis is simulated along with the curved fins. Areas of flow separation and higher drag are visualized using the variation of axial force along the length. The paper emphasizes on the contribution of base drag to the total drag. Authors A Venkateshwarlu and Hem Raj (BDL), Kumar Mihir and Sanjay Kumar (Zeus Numerix), Prof KE Prasad JNTU.
Development of Speed Profile Model for Two-Lane Rural Roadinventionjournals
It's desirable to make road design match with driver's expectation and consistency of road design implies harmonized connection among road design elements. A road alignment safety evaluation based on road design consistency which has been accelerated recently is designed to calculate the variation of driving speed depending on geometric structure based on operating speed profile and the result is used as the index in evaluating the design consistency. Instead of the safety of individual road design elements (tangent or curve), consistency of continuous design elements is rather stressed. Under such background, driving speed prediction model at horizontal curve/tangent and acceleration/deceleration model at entry and exit of horizontal curve were developed in this study in a bid to evaluate the alignment safety of a 2-lane rural road, which will be used to supplement current road design standard to operating speed-based standard so as to make commitment to designing the road from driver's standpoint, not the viewpoint of road designer or policy-maker.
International Journal of Mathematics and Statistics Invention (IJMSI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJMSI publishes research articles and reviews within the whole field Mathematics and Statistics, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
A METHOD OF TARGET TRACKING AND PREDICTION BASED ON GEOMAGNETIC SENSOR TECHNO...cscpconf
In view of the inherent defects in current airport surface surveillance system, this paper
proposes an asynchronous target-perceiving-event driven surface target surveillance scheme
based on the geomagnetic sensor technology. Furthermore, a surface target tracking and
prediction algorithm based on I-IMM is given, which is improved on the basis of IMM
algorithm in the following aspects: Weighted sum is performed on the mean of residual errors
and model probabilistic likelihood function is reconstructed, thus increasing the identification
of a true motion model; Fixed model transition probability is updated with model posterior
information, thus accelerating model switching as well as increasing the identification of a
model. In the period when a target is non-perceptible, prediction of target trajectories can be
implemented through the target motion model identified with I-IMM algorithm. Simulation
results indicate that I-IMM algorithm is more effective and advantageous in comparison with
the standard IMM algorithm.
Studied the factors that impact the landing distance of a commercial flight and built a linear regression model to predict the risk of landing overrun.
Tools used: SAS
Failure Diagnostic and Performance Monitoring, it is a part of CM program for airlines, it is addressing the condition of components of the aircraft, either it is initial , random or wear failures.
FAA Flight Landing Distance Forecasting and AnalysisQuynh Tran
The overall goal of this project is to get an ideal model to forecast landing distance based on variables given in the dataset. To be able to come up with a good model that fits the dataset, we need to go through some certain steps to explore, clean, visualize, and analyze values in the dataset.
Multiple Dependent State Repetitive Sampling(MDSRS) plan is a combination of
Multiple deferred state sampling plan as well as repetitive sampling plan.This paper
deals with multiple dependent repetitive state sampling plan for certain attribute
control chart with respect to Poisson, gamma Poisson, fuzzy Poisson andfuzzy
Binomial distributions. The average run length for various distributions in MDSRS
are tabulated. Graphical illustrations are also made. The average run lengths are
compared for smaller shifts in the process using control charts for different parameter
values. The proposed method will be much useful in industry during monitoring of
manufacturing process. An example of earthquake data set from UCI respiratory is
considered and the average run length is computed based on fuzzy Poisson
distribution.
Automatic Landing of a UAV Using Model Predictive Control for the Surveillanc...AM Publications
The objective of this paper is to provide safe landing of an UAV on any demanding conditions and can be very useful during the period of floods, rescuing the affected people. It can be also be useful in military operations. This system provides stable method for landing of an aircraft. Image processing technique is used to capture the image and find the angle of inclinations. MATLAB produces four types of image through Image processing technique. They are Gray image, Edge Detection image, Histogram image, object detection image. MATLAB sends the histogram signal to microcontroller using UART. It is used to convert the received parallel data into serial data for transmitting the data to longer distance. Relay drives the DC motor. Angle of inclinations is measured according to the detected image of the surface. MATLAB produces the total inclination value of the surface. DC motor automatically adjusts the landing gear according to the obtained input inclination value of the surface. An UAV is programmed to adjust the landing gear according to angle of inclination. When the landing surface is detected for landing the aircraft, it automatically adjusts the landing gear and lands the aircraft according to the inclined surface of the landing area.
Show drafts
volume_up
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
As Europe's leading economic powerhouse and the fourth-largest hashtag#economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like hashtag#Russia and hashtag#China, hashtag#Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in hashtag#cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to hashtag#AdvancedPersistentThreats (hashtag#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Predicting aircraft landing overruns using quadratic linear regression
1. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
SUMMARY
This project has been conducted with the objective of predicting landing overruns for aircrafts using the
given data.
Landing overruns are a major problem in the aviation industry as they may can damage to aircrafts as
well as the crew and passengers on-board. In past, many aircrafts have fallen victim to this and hence,
this project aims at reducing risk of overrun by predicting if an aircraft may have a landing overrun or
not based on certain parameters.
The given data is in the form of two tables containing 950 observations in total and having seven
variables namely Aircraft, Distance, Duration, Ground Speed, Air Speed, Height and Pitch. After cleaning
the data and performing sanity checks according to a given set of conditions, only 780 observations
were found to be fit for modeling.
A linear regression model was fit to the data and factors Aircraft Type, Ground Speed, square of Ground
Speed and Height were found to be important. Air Speed was removed from the modeling process
because of having 75% missing values. Comparison of regression models by aircraft type has also been
done in the process.
The linear equation was found to be:
Distance = 2187.5 + (-401.95) * Aircraft Code + (-69) * Ground Speed + 0.69 * Square of
Ground Speed + 13.66 * Height
Here aircraft code = 1 for “airbus” and 0 for “boeing”. The r-squared value for the model was 0.98 and
Root mean squared distance on the entire dataset was 135.69 metres.
The model was checked for diagnostics and the residuals were found to follow the assumptions of
Independence, 0 mean, normal distribution and constant variance.
Due to high R-squared of the model, the model has been checked for overfitting by performing
validation on testing dataset after splitting the given dataset into testing and training. The model was
found to be consistent across different sets of test dataset. The details of validation have been left out
from this report due to its limited scope.
As a result, using this model, given the required set of parameters, we can accurately predict the landing
distance of an aircraft and hence its possibility of overrun if we have the prior knowledge of airstrip
length. This information can be used to intimate pilots in advance of any risk of overrun before they
land.
2. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Chapter – 1
DATA EXPLORATION AND DATA CLEANING
OBJECTIVE:
The objective of this exercise is to perform sanity checks on the data, report inconsistencies and
undertake measures to clean the data.
PROCEDURE:
Data Exploration is a step by step procedure to understand the quality and parameters of data. Here, I
have combined both the data sources and run then explored the aggregated table. The steps are as
follows:
1. Uploading the data on the SAS On-Demand server.
3. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
2. Combining the datasets
3. Performing a univariate analysis on each variable (Frequency on categorical variables)
a. Categorical Variable: Aircraft
4. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
b. Numerical Variables:
The
detailed output for PROC MEANS.
Below is the summary of important parameters of these 7 variables:
Note: Since its very important to observe the percentage of outliers of each variable before removing
them so that we can keep a track of how many records are removed because of which variable, I am
looking at the distributions of the variables and their outliers before cleaning the data.
4. Notable Observations:
a. General Observations:
i. No unique key such as ‘Flight ID’ is present in the datasets.
ii. Dataset FAA2 has first 100 observations having same values as those of FAA1. These
can be removed after consent of the client.
b. Variable-specific observations:
i. Categorical Variable: Aircraft
1. ‘Aircraft’ is a categorical variable and the records are well distributed
between the two aircraft types – boeing and airbus, thus ensuring no bias in
terms of proportion.
2. However, there are be a difference in the parameters of these two types of
aircrafts which can be explored further in Bivariate analysis.
5. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
ii. Numerical Variables:
1. Duration:
a. There are 15.79 % missing values of duration.
b. It is given that duration of a flight should always be > 40 mins but from data we can
see that the minimum duration is 14.76 mins. Hence, records with duration <= 40
min are outliers. This percentage is very small(0.63%) as we can see from the
histogram.
c. Variable is normally distributed.
2. No. of Passengers:
a. No missing values
b. Variable is normally distributed
6. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
3. Ground Speed:
a. No missing values
b. Given range of ground speed = 30 – 140 . Data outside this interval is outlier(1.59%).
c. Data looks normally distributed.
4. Air Speed :
a. 74.84 % missing values
b. Valid range of ground speed = 30 – 140. Values outside this range are outliers
(0.84%).
7. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
c. Variable is left skewed.
5. Height:
a. No missing values.
b. Minimum height required = 6 m. Values less than this are outliers(1.26%).
c. Variable is normally distributed.
6. Pitch:
a. No missing values.
b. Data is normally distributed.
8. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
7. Distance:
a. No missing values.
b. Length is typically less than 6000m but more runway is not a problem for overrun
risk.
c. Data is left skewed but the percentage of longer runways is very less.
9. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
DATA CLEANING:
On the basis of the above observations, the data can be cleaned using the given rules (mentioned in the
above observations as well).
Deduping Data:
SAS Code:
Output:
Variable – wise filtering:
In addition to filtering every variable as given in the problem statement, I have removed Distance < 100
as passenger airplanes cannot land in such small distances and Distance >= 6000 due to abnormality. As
a result, 2 additional rows have been removed.
SAS Code:
Output:
10. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
780 observations are left after cleaning.
Summarizing cleaned data:
SAS Code :
Output:
11. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
DECISION/CONCLUSION:
1. 100 rows of FAA2 have same values as those of FAA1. These have been removed.
2. Removing missing duration values:
o Total rows = 850
Missing duration values = 50 (5.9%) which is within limits and hence should be removed.
3. Air Speed should not be used for analysis as ~75% values are missing. Due to this high number,
even imputation is not a good option.
o From the distribution, it looks like values < 90 are missing from the data due to some
kind of data capture issue.
4. Distance variable is left skewed but can be approximated to a normal distribution.
QUESTIONS:
1. Is there any alternate data available to compensate for the variable air speed ?
2. Is it okay to use transformations to convert a skewed distribution to a normal distribution ?
3. Should we do a univariate analysis BY Aircraft type as well ?
12. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Chapter – 2
DESCRIPTIVE STUDY
OBJECTIVE :
The objective of this exercise is to explore X-Y plots and identify the need for any transformations. It also
includes doing multivariate analysis on variables to determine correlation.
PROCEDURE :
Data visualizations involve two steps :
1) Creating X- Y Plots:
Distance vs Duration: Duration is randomly distributed vs Distance.
Distance vs Number of Passengers: Number of passengers is randomly distributed vs Distance
13. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Distance vs Ground Speed: There seems to be a trend in Ground Speed vs Distance plot. It can be an
exponential or squared trend and hence a transformation (preferably a squared one can be taken to
make it linear).
Distance vs Air Speed: The plot of Distance vs Air Speed is highly linear.
14. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Distance vs Height: The plot of Distance vs Height looks randomly distributed.
15. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Distance vs Pitch: The plot of Distance vs Pitch is randomly distributed.
Distance vs Ground Speed2
: The plot of Distance vs Ground Speed2
is highly linear.
16. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Overlay Plot :
Distance*Speed_Ground / Distance*Speed_Air Overlay: Air Speed appears to be falling within the
distribution of ground speed. Hence, we can exclude air speed from our regression model. However, we
can double check this using correlation of variables.
2) Correlation Matrix:
17. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Observations from Correlation Matrix:
1) Ground Speed and Air Speed are highly correlated to the target variable Distance.
2) Ground Speed and Air Speed are also highly correlated to each other. Hence, we should exclude Air
Speed from our model as it has only 195 valid observations (only 25%) and its effect is taken care of by
variable Ground Speed.
3) All other variables in the correlation matrix are independent to each other.
18. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Chapter – 3
STATISTICAL MODELING
OBJECTIVE :
The objective of this exercise is to perform modeling on the given data and get model parameters. These
parameters will then help in creating an equation for linear regression.
PROCEDURE :
Data modeling including preparing data for modeling as well as building the model. This is followed by
checking model diagnostics.
1) Data Preparation for modeling:
Dummitizing variable ‘AIRCRAFT’ – Since, SAS does not automatically dummitize categorical variables,
we will need to do this manually by creating a column for AIRCRAFT CODE having value 1 for “Airbus”
and 0 for “Boeing” (Variable Transformation – 1).
2) Modeling:
Baseline Model : Using all variables
20. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Inferences:
1) Duration, number of passengers and pitch are not significant. Only Aircraft Code, Ground Speed and
Height are important (Air Speed had already been pulled out of consideration for modeling).
2) The overall residuals for Distance follow a curve and are not homoscedastic which is a pre-requisite
condition for linear regression. Hence, we can use a transformation to make it homoscedastic.
Redefining Model:
1) Variable Transformation for re-modeling:
21. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
2) Re-modeling:
3) Refining the model based on p-values:
22. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Model Interpretation:
We can summarize the coefficient estimates of different variables in the model as under:
Factor Coefficient
Aircraft Code -401.95385
Ground Speed -69.01536
Square of Ground Speed 0.69220
23. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
We can now build the equation for linear regression using these estimates. Given the intercept of
2187.5, the linear regression equation is:
Distance = 2187.5 + (-401.95) * Aircraft Code + (-69) * Ground Speed + 0.69 * Square of
Ground Speed + 13.66 * Height
The equation can be interpreted in terms of different variables as:
Aircraft Code: For change in aircraft from boring to airbus, the distance estimate reduces by 401.95
meters.
Ground Speed: For 1 unit change in ground speed, distance changes by -69 metres due to linear value
but for every unit change in square value of ground speed, distance changes by 0.69 meters.
Height: For 1 unit change in heightm the distance changes by 13.66 metres.
3) Model Diagnostics:
Checking for following assumptions on the residuals:
1) Independent 2) Normally Distributed 3) Mean = 0 4) Constant Variance
Inference:
From above we can see that the residuals follow all the assumptions and their normality can be proved
through p<0.01 in Kolmogorov-Smirnov, Cramer-con Mises and Anderson Darling tests above. Their
constant variance can be seen from Residuals plot in the model.
Height 13.66
24. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Write Short answers to questions:
1) How many observations (flights) do you use to fit your final model? If not all 950 flights, why?
Ans: I used a total of 780 observations in my final model. Remaining 150 got removed as a part of the
cleaning process below:
S.No. Action Rows Removed Remaining
1 De-duplicating the combination of two datasets 100 850
2 30 <= Ground Speed <= 140 3 847
3 30 <= Air Speed <= 140 1(common with above) 847
4 Duration >40 and duration is not missing 55 792
5 Height >= 6 10 782
6 100 <= Distance <= 6000 2 780
2) What factors and how they impact the landing distance of a flight?
Ans: The effect of all factors on impact of Distance is given by the following equation:
Distance = 2187.5 + (-401.95) * Aircraft Code + (-69) * Ground Speed + 0.69 * Square of
Ground Speed + 13.66 * Height
Number of passengers, duration and pitch do not significantly affect the model as can be seem from
their X-Y plots as well as linear regression p-values.
Aircraft code, Ground Speed, Square of Ground Speed and Height significantly affect the model and
98% of the variance is explained by these factors.
Aircraft code has a step effect whereas height as a positive liner relationship with distance. Ground
speed affects distance in a combination of linear form and squared form.
The relationship of significant variables can also be seen from their X-Y plots and p-values in the model.
3) Is there any difference between the two makes Boeing and Airbus?
Ans: Doing Regression by Variable ‘Aircraft’
For aircraft = airbus: For aircraft = boeing :
25. STATISTICAL COMPUTING PROJECT – SECTION 003 PRERIT SAXENA (M12366923)
Observation : Though the RMSE is same for both, there is a large difference in coefficient of pitch as well
as Intercept.
Further exploring by T-Test on Distance and Pitch:
Distance:
Since, the variances are equal, through Pooled method, null hypothesis is rejected, so there is a
significant difference between landing distances of two types of aircrafts.
Pitch:
Since, the variances are unequal, through Satterthwaite method, null hypothesis is rejected, so there is a
significant difference between pitch of two types of aircrafts.