STATISTICAL COMPUTING PROJECT – BANA 6043
NAME : POORVI DESHPANDE
UCID: M12388313
ABSTRACT
In this project, we are trying to determine what factors affect the landing distance of a commercial
flight and how they would impact the landing distance. We have been given two data sets
consisting of 950 flight observations combined.
To identify the factors and the magnitude by which they affect landing distance, we have used a
linear regression model wherein the target variable is landing distance (distance) and the rest serve
as predictor variables.
We follow various steps to reach an equation that describes our model. The steps include data
preparation, data exploration and data modelling. The correlations between the target and the
explanatory variables are calculated, and the variables that have a significant impact on landing
distance are chosen.
Landing distance is dependent on the type of aircraft, ground speed and height of the aircraft.
Distance = -2554.47 + 501.57(aircraft_cat) + 42.79(speed_ground) + 14.52(height)
*Aircraft_cat is 1 for Boeing and 0 for Airbus
CHAPTER 1 : DATA PREPARATION
Data preparation produces a clean data set for further analysis and accurate statistics.
The data needs to be checked and filtered according to the acceptable conditions defined in the
problem statement.
STEPS
1. Combining data sets
We were given two data sets. Because all columns but one are common to both, it is simpler to
combine the two and draw inferences from the combined data set.
PROC IMPORT
DATAFILE="/home/deshpapi0/Landing/FAA1.xls"
OUT=FAA1
DBMS=xls REPLACE;
RUN;
PROC IMPORT
DATAFILE="/home/deshpapi0/Landing/FAA2.xls"
OUT=FAA2
DBMS=xls REPLACE;
RUN;
DATA COMBINED;
SET FAA1 FAA2;
RUN;
2. Fetching basic details about the combined data set.
PROC MEANS DATA=combined;
RUN;
PROC UNIVARIATE DATA=COMBINED;
VAR speed_air height pitch distance duration;
HISTOGRAM speed_air height pitch distance duration;
RUN;
HISTOGRAMS OF VARIABLES
We observe that all variables other than speed_air and distance have a normal (or close to
normal) distribution.
3. Check for duplicate values
PROC SORT data=COMBINED NODUPKEY;
BY aircraft speed_ground no_pasg speed_air height pitch distance;
RUN;
We had 100 duplicate rows. These duplicate rows are deleted.
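The duplicate count can be checked directly by asking PROC SORT to write the discarded rows to their own data set. The following is a minimal sketch, meant to be run in place of the in-place sort above; the data set names COMBINED_NODUP and DUPLICATES are hypothetical and chosen only for illustration.

```sas
/* Keep the de-duplicated rows and the discarded rows in separate data sets. */
/* COMBINED_NODUP and DUPLICATES are illustrative names, not from the report. */
PROC SORT DATA=COMBINED OUT=COMBINED_NODUP NODUPKEY DUPOUT=DUPLICATES;
  BY aircraft speed_ground no_pasg speed_air height pitch distance;
RUN;

/* Count the rows that were discarded as duplicates. */
PROC SQL;
  SELECT COUNT(*) AS n_duplicates FROM DUPLICATES;
QUIT;
```

If this variant is used, the later steps would read COMBINED_NODUP instead of COMBINED.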
4. Checking for missing values and treating them
proc means data=COMBINED NMISS N;
run;
We find that there are 50 missing values for the variable ‘duration’ and 642 missing values for
‘speed_air’. At this stage we do not delete the rows with missing values, because we do not yet
know how strongly these variables affect the target variable. These rows may also contain outliers,
and removing them now could change the statistics of the data considerably.
5. Categorizing data
A flight is marked as normal or abnormal based on a number of criteria.
A new data set, FLIGHT_DATA, is created to hold the filtered observations. Since we limit our
model to the normal observations only, we can delete the abnormal observations.
According to the conditions given,
1. Duration: The duration of a normal flight should always be greater than 40min.
Deleting all flights with flight duration less than 40.
2. Speed_ground: If its value is less than 30MPH or greater than 140MPH, then the landing
would be considered as abnormal. Deleting all abnormal speed_ground.
3. Height: The landing aircraft is required to be at least 6 meters high at the threshold of the
runway. So, height < 6 meters is abnormal. Deleting all rows with height<6.
4. Speed_air (in miles per hour): The air speed of an aircraft when passing over the threshold of
the runway. If its value is less than 30MPH or greater than 140MPH, then the landing would be
considered as abnormal.
NOTE: Missing values are counted as Normal for now.
/* deleting abnormal flights; missing duration and speed_air are treated as normal */
DATA FLIGHT_DATA;
SET COMBINED;
IF (duration<40 AND NOT MISSING(duration)) OR Speed_ground<30 OR Speed_ground>140
OR (speed_air<30 AND NOT MISSING(speed_air)) OR speed_air>140 OR height<6 THEN
DELETE;
RUN;
proc print data= flight_data;
run;
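To see how many flights each run of the filter removes, the same conditions can be stored as a 0/1 flag instead of deleting rows. This is a sketch under the same rules as the DATA step above; the data set FLIGHT_FLAGS and the variable abnormal are hypothetical names introduced for illustration.

```sas
/* Flag abnormal flights instead of deleting them (illustrative names). */
DATA FLIGHT_FLAGS;
  SET COMBINED;
  /* A SAS logical expression evaluates to 1 (true) or 0 (false). */
  abnormal = ((NOT MISSING(duration) AND duration < 40)
              OR speed_ground < 30 OR speed_ground > 140
              OR (NOT MISSING(speed_air) AND speed_air < 30)
              OR speed_air > 140
              OR height < 6);
RUN;

/* Frequency of normal (0) versus abnormal (1) flights. */
PROC FREQ DATA=FLIGHT_FLAGS;
  TABLES abnormal;
RUN;
```

The count for abnormal = 1 should equal the number of rows in COMBINED minus the number of rows in FLIGHT_DATA.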
6. Fetching statistics about the clean data set
PROC MEANS DATA=FLIGHT_DATA;
RUN;
HISTOGRAMS OF VARIABLES IN THE CLEANED DATA SET : FLIGHT_DATA
CHAPTER 2 : DATA EXPLORATION
Data exploration is done to statistically analyze the clean data before regression
modelling. This step encompasses visualizing the spread of the data, checking for linearity and
checking whether the target and predictor variables are correlated. The variables which do not
have any effect on the target variable can be eliminated.
Steps:
1. It is advisable to plot the data before modelling, as the plots indicate whether the
variables are linearly related. If there is a linear relationship, the points fall on or close
to a straight line. Otherwise we see a scattered plot in which no linear
relationship can be determined.
PROC PLOT DATA= FLIGHT_DATA;
PLOT distance * (duration no_pasg speed_ground speed_air height pitch);
RUN;
PLOTS: distance * duration, distance * no_pasg, distance * speed_ground, distance * speed_air,
distance * height, distance * pitch
We observe that speed_ground and speed_air are linearly related to distance. But by how
much? To find the magnitude of the relationship, we calculate the correlation
coefficients.
2. Finding correlation coefficients
Before finding coefficient of correlation, we need to transform ‘aircraft’ which is a categorical
variable into a numerical one. We do this by creating dummy variables.
/*dummy variables for aircraft */
DATA FLIGHT_DATA;
SET flight_data;
IF (aircraft= "boeing") then aircraft_cat = 1;
else aircraft_cat = 0;
RUN;
This creates another column, aircraft_cat, and populates it with 1 for boeing and 0 for airbus. The
choice of which make is coded as 1 does not affect the fit of the model; it only lets us take the
make of the aircraft into consideration.
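A cross-tabulation can confirm which make received which code, which matters when the sign of the coefficient is interpreted later. This is a minimal sketch, assuming the aircraft column holds the lowercase values "airbus" and "boeing"; SAS string comparisons are case-sensitive, so any other spelling would fall into the else branch and be coded 0.

```sas
/* Cross-tabulate the original make against the dummy variable. */
/* Every boeing row should fall in aircraft_cat = 1, every airbus row in 0. */
PROC FREQ DATA=FLIGHT_DATA;
  TABLES aircraft*aircraft_cat / NOROW NOCOL NOPERCENT;
RUN;
```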
Now, a correlation matrix is created to determine the magnitude of correlation between the
variables. Since this also gives us the correlations among the independent variables, we can also
determine whether any of the predictors are correlated with one another.
proc corr data=flight_data;
var distance aircraft_cat duration no_pasg speed_ground speed_air height pitch;
title Pairwise correlation coefficients;
run;
Conclusions:
a) The variable distance is highly correlated with speed_ground and speed_air. Distance
is not significantly correlated with duration and no_pasg, as their p-values are greater than 0.05.
b) Therefore, these two variables, no_pasg and duration, are not expected to contribute to the
regression model for the given data.
c) We observe that speed_air and speed_ground are highly correlated with each other.
Thus, we can drop one of the two variables. Since speed_air has a significant
number of missing values, it makes sense to drop that column altogether.
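Before the column is dropped, the overlap between the two speed variables can also be quantified with variance inflation factors. The sketch below is not part of the original analysis; it assumes speed_air is still present in FLIGHT_DATA, and PROC REG will use only the rows where speed_air is not missing.

```sas
/* Variance inflation factors for the two speed variables. */
/* Large VIF values indicate that the predictors carry nearly the same information. */
PROC REG DATA=FLIGHT_DATA;
  MODEL distance = speed_ground speed_air / VIF;
RUN;
QUIT;
```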
/* drop column speed_air */
data FLIGHT_DATA (drop=speed_air) ;
set FLIGHT_DATA;
RUN;
PROC PRINT DATA=FLIGHT_DATA;
RUN;
CHAPTER 3 : DATA MODELLING
Data modelling is done to obtain an equation that explains the dependence of the target variable
on the independent variables chosen through data exploration. In this chapter, we use
regression to estimate how these factors affect landing distance.
A simple linear equation can be defined as 𝑦 = 𝛼 + 𝛽𝑥 + 𝜀
Where 𝛼 is the intercept, 𝛽 is the parameter estimate for variable x and 𝜀 is error.
/* regression */
PROC REG data=flight_data;
MODEL distance = aircraft_cat duration speed_ground height pitch / r spec;
output out= FLIGHT_FINAL r= residual; run;
Since the null hypothesis (that the coefficient of the variable is zero) cannot be rejected for
duration (p value = 0.9097) and pitch (p value = 0.4561), we can remove these variables from the
equation.
Now we do a regression on the remaining variables : Intercept, speed_ground, aircraft_cat, height.
PROC REG data=flight_data;
MODEL distance = aircraft_cat speed_ground height / r spec;
output out= FLIGHT_FINAL r= residual;
run;
After this second regression, all remaining variables are significant based on their p-values, so
none of them can be removed. Therefore, this is our final regression model.
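The regression step saves the residuals in FLIGHT_FINAL, but the report does not show how they were examined. One possible check, offered only as a sketch, is to test the residuals for normality and plot their histogram against a fitted normal curve.

```sas
/* Normality tests and histogram for the residuals saved by PROC REG. */
PROC UNIVARIATE DATA=FLIGHT_FINAL NORMAL;
  VAR residual;
  HISTOGRAM residual / NORMAL;
RUN;
```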
α = -2554.46892
β1 = 501.57254 for x1 = aircraft_cat
β2 = 42.78669 for x2 = speed_ground
β3 = 14.52014 for x3 = height
Distance = -2554.47 + 501.57(aircraft_cat) + 42.79(speed_ground) + 14.52(height)
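As a worked example, the parameter estimates listed above (α, β1, β2, β3) can be applied to a single flight. The inputs below, a ground speed of 80 and a height of 30, are hypothetical values chosen for illustration and are not taken from the data; the data set name PREDICT_EXAMPLE is likewise illustrative.

```sas
/* Predicted landing distance for one hypothetical flight, for each make. */
DATA PREDICT_EXAMPLE;
  speed_ground = 80;  /* hypothetical input */
  height = 30;        /* hypothetical input */
  DO aircraft_cat = 0 TO 1;  /* 0 = airbus, 1 = boeing, as coded in the dummy-variable step */
    predicted = -2554.46892 + 501.57254*aircraft_cat
                + 42.78669*speed_ground + 14.52014*height;
    OUTPUT;  /* write one row per make */
  END;
RUN;

PROC PRINT DATA=PREDICT_EXAMPLE;
RUN;
```

With these inputs the predicted distance is approximately 1304.07 for aircraft_cat = 0 and 1805.64 for aircraft_cat = 1; the two differ by exactly β1.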
Questions:
1. How many observations (flights) do you use to fit your final model? If not all 950 flights, why?
In this model, 832 observations have been used after removing the duplicates and all the abnormal
values. The removed observations are either duplicate rows or flights that do not comply with the
conditions given for acceptable flight landings.
2. What factors impact the landing distance of a flight, and how?
Landing distance is dependent on the type of aircraft, ground speed and height of the
aircraft.
Distance = -2554.47 + 501.57(aircraft_cat) + 42.79(speed_ground) + 14.52(height)
Aircraft_cat is a dummy variable created to distinguish the two makes of aircraft: 0 for
airbus and 1 for boeing. The choice of coding does not change the results.
3. Is there any difference between the two makes Boeing and Airbus?
Yes. Aircraft category is a significant variable in the model, so landing distance differs between
the two makes. With aircraft_cat coded as 1 for boeing, the coefficient of 501.57 means that a
Boeing is predicted to land about 501.57 distance units farther than an Airbus at the same ground
speed and height. Here we have 444 observations with aircraft type airbus and 388 observations with
aircraft type boeing.
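The per-make counts and the raw difference in landing distance can be read from a grouped summary. This is a minimal sketch on the cleaned FLIGHT_DATA set; note that the raw difference in means is not adjusted for ground speed or height, so it need not equal the regression coefficient.

```sas
/* Number of flights and mean landing distance for each make. */
PROC MEANS DATA=FLIGHT_DATA N MEAN STD;
  CLASS aircraft;
  VAR distance;
RUN;
```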

  • 1. STATISTICAL COMPUTING PROJECT – BANA 6043 NAME : POORVI DESHPANDE UCID: M12388313 ABSTRACT In this project, we are trying to determine what factors affect the landing distance of a commercial flight and how they would impact the landing distance. We have been given two data sets consisting of 950 flight observations combined. To identify the factors and the magnitude by which they affect landing distance, we have used a linear regression model wherein the target variable is landing distance (distance) and the rest serve as predictor variables. We follow various steps to reach an equation that describes our model. The steps include data preparation, data exploration and data modelling. The correlation between the target and the explanatory variables are calculated and a set of variables are chosen which have significant impact on landing distance. Landing distance is dependent on the type of aircraft, ground speed and height of the aircraft. Distance =-2554.47 + 501.57(aircraft_cat)+ 42.79(speed_ground)+ 12.52(height) *Aircraft_cat is 0 for Boeing and 1 for Airbus
  • 2. CHAPTER 1 : DATA PREPARATION Data preparation is done so as to obtain a clean data set for further analysis and accurate statistics. The data needs to checked and filtered according to the acceptable conditions defined in the problem statement. STEPS 1. Combining data sets We had 2 data sets with us. It serves better to combine the two and make out common inferences about the combined dataset as all the columns were column save one. PROC IMPORT DATAFILE="/home/deshpapi0/Landing/FAA1.xls" OUT=FAA1 DBMS=xls REPLACE; RUN; PROC IMPORT DATAFILE="/home/deshpapi0/Landing/FAA2.xls" OUT=FAA2 DBMS=xls REPLACE; RUN; DATA COMBINED; SET FAA1 FAA2; RUN; 2. Fetching basic details about the combined data set. PROC MEANS DATA=combined; RUN; PROC UNIVARIATE DATA=COMBINED; VAR speed_air; HISTOGRAM speed_air; PROC UNIVARIATE DATA=COMBINED; VAR height; HISTOGRAM height; PROC UNIVARIATE DATA=COMBINED; VAR pitch; HISTOGRAM pitch;
  • 3. PROC UNIVARIATE DATA=COMBINED; VAR distance; HISTOGRAM distance; PROC UNIVARIATE DATA=COMBINED; VAR duration; HISTOGRAM duration; HISTOGRAMS OF VARIABLES
  • 4. We observe that other than speed_air and distance, other variables have a normal (or close to normal) distribution. 3. Check for duplicate values PROC SORT data=COMBINED NODUPKEY; BY aircraft speed_ground no_pasg speed_air height pitch distance; RUN; We had 100 duplicate rows. These duplicate rows are deleted. 4. Checking for missing values and treating them proc means data=COMBINED NMISS N; run; We find that there are 50 missing values for the variable ‘duration’ and 642 missing values for ‘speed_air’. At this stage we cannot go ahead and delete these missing values because we do not know how significantly they affect the target variable. Also there could be outliers in these missing values which could change the statistics of the data considerably.
  • 5. 5. Categorizing data A flight is marked as normal or abnormal based on a number of criteria. Another dataset has been created on which I have applied transformations. Since we limit our model to the normal observations only, we can delete the abnormal observations. According to the conditions given, 1. Duration: The duration of a normal flight should always be greater than 40min. Deleting all flights with flight duration less than 40. 2. Speed_ground: If its value is less than 30MPH or greater than 140MPH, then the landing would be considered as abnormal. Deleting all abnormal speed_ground. 3. Height: The landing aircraft is required to be at least 6 meters high at the threshold of the runway. So, eight < 6 meters is abnormal. . Deleting all rows with height<6. 4. Speed_air (in miles per hour): The air speed of an aircraft when passing over the threshold of the runway. If its value is less than 30MPH or greater than 140MPH, then the landing would be considered as abnormal. NOTE: Missing values are counted as Normal for now. /*deleting abnormal flights*/ DATA FLIGHT_DATA; SET COMBINED; IF (duration<40 AND duration ^= '.') OR Speed_ground<30 OR Speed_ground>140 OR (speed_air<30 AND speed_air ^='.') OR speed_air>140 OR height<6 THEN DELETE; RUN; proc print data= flight_data; run; . .
6. Fetching statistics about the clean data set
PROC MEANS DATA=FLIGHT_DATA;
RUN;
HISTOGRAMS OF VARIABLES IN THE CLEANED DATA SET : FLIGHT_DATA
CHAPTER 2 : DATA EXPLORATION
Data exploration is done so as to statistically analyze the clean data for further regression modelling. This step encompasses visualizing the spread of the data, checking for linearity and checking whether a correlation exists between the target and predictor variables. The variables which do not have any effect on the target variable can be eliminated.
Steps:
1. It is advised to plot the data before modelling, as it gives an estimate of the linear correlation between variables. If there is a linear correlation, the points fall on (or close to) a straight line. Otherwise we see a scattered plot in which no linear relationship can be determined.
PROC PLOT DATA=FLIGHT_DATA;
PLOT distance * (duration no_pasg speed_ground speed_air height pitch);
RUN;
PLOTS:
distance * duration
distance * pitch
We observe that speed_ground and speed_air have a linear relationship with distance. But how strong is it? We need to find the magnitude of the correlation, which we do by computing coefficients of correlation.
2. Finding correlation coefficients
Before finding the coefficients of correlation, we need to transform ‘aircraft’, which is a categorical variable, into a numerical one. We do this by creating a dummy variable.
/* dummy variable for aircraft */
DATA FLIGHT_DATA;
SET FLIGHT_DATA;
IF (aircraft = "boeing") THEN aircraft_cat = 1;
ELSE aircraft_cat = 0;
RUN;
This creates another column, aircraft_cat, and populates it with 1 for boeing and 0 for airbus. The recoding does not change the data in any way, but it lets us take the make of aircraft into consideration.
Now, a correlation matrix is created to determine the magnitude of correlation between the variables. Since this also gives us the correlation between the independent variables, we can also determine whether any of the predictors are correlated with one another.
PROC CORR DATA=FLIGHT_DATA;
VAR distance aircraft_cat duration no_pasg speed_ground speed_air height pitch;
TITLE "Pairwise correlation coefficients";
RUN;
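As a complement to the pairwise correlations, the overlap between speed_ground and speed_air could also be examined with variance inflation factors. This is a sketch of a different technique from the one used in this report, not part of the original analysis. It assumes speed_air is still present in FLIGHT_DATA, and PROC REG will leave out the rows where speed_air is missing.

```sas
/* Sketch only: a VIF check is an assumption, not a step from the report. */
/* The VIF option on the MODEL statement adds a variance inflation        */
/* factor for each predictor to the parameter estimates table.            */
PROC REG DATA=FLIGHT_DATA;
    MODEL distance = speed_ground speed_air / VIF;
RUN;
QUIT;
```

A common rule of thumb treats a variance inflation factor above 10 as a sign that a predictor is nearly a linear combination of the others, which would support keeping only one of the two speed variables.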
Conclusions:
a) The variable distance is highly correlated with speed_ground and speed_air. Also, distance is not significantly correlated with duration and no_pasg, as their p-values are greater than 0.05.
b) Therefore, these two variables, no_pasg and duration, play no role in determining the regression model for the given data.
c) We observe that speed_air and speed_ground are highly correlated with each other. Thus, we can drop one of the two. Since speed_air has a significant number of missing values, it makes sense to drop that column altogether.
/* drop column speed_air */
DATA FLIGHT_DATA (DROP=speed_air);
SET FLIGHT_DATA;
RUN;
PROC PRINT DATA=FLIGHT_DATA;
RUN;
CHAPTER 3 : DATA MODELLING
Data modelling is done to obtain an equation that explains the dependence of the target variable on the independent variables chosen through data exploration. In this model, we will focus on how other factors affect landing distance through regression.
A simple linear equation can be defined as
𝑦 = 𝛼 + 𝛽𝑥 + 𝜀
where 𝛼 is the intercept, 𝛽 is the parameter estimate for variable 𝑥 and 𝜀 is the error.
/* regression */
PROC REG DATA=FLIGHT_DATA;
MODEL distance = aircraft_cat duration speed_ground height pitch / R SPEC;
OUTPUT OUT=FLIGHT_FINAL R=residual;
RUN;
Since we cannot reject the null hypothesis (that the variable's coefficient is zero) for duration (p value = 0.9097) and pitch (p value = 0.4561), we can simply remove these variables from the equation.
Now we do a regression on the remaining variables: Intercept, speed_ground, aircraft_cat, height.
PROC REG DATA=FLIGHT_DATA;
MODEL distance = aircraft_cat speed_ground height / R SPEC;
OUTPUT OUT=FLIGHT_FINAL R=residual;
RUN;
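The OUTPUT statement above stores the residuals in FLIGHT_FINAL, but the report does not show how they are examined. One possible check, offered here as an assumption rather than a step from the original analysis, is to look at whether the residuals are roughly normally distributed.

```sas
/* Sketch only: this residual check is not part of the original report. */
/* NORMAL requests tests for normality; HISTOGRAM / NORMAL overlays a   */
/* fitted normal curve on the residual histogram.                       */
PROC UNIVARIATE DATA=FLIGHT_FINAL NORMAL;
    VAR residual;
    HISTOGRAM residual / NORMAL;
RUN;
```

Strongly skewed residuals would suggest that the linear form does not describe landing distance well over the whole range of the data.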
After this second regression, all of the remaining variables are significant based on their p-values, so none of them can be dropped. Therefore, this is our final regression model.
α = -2554.46892
β1 = 501.57254 for x1 = aircraft_cat
β2 = 42.78669 for x2 = speed_ground
β3 = 14.52014 for x3 = height
Distance = -2554.47 + 501.57(aircraft_cat) + 42.79(speed_ground) + 12.52(height)
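To read the fitted coefficients, it helps to change one predictor while holding the others fixed. The worked lines below are an illustration only: the 10 MPH increase in speed_ground is an arbitrary example value, and the second line assumes the coding from the dummy-variable DATA step (aircraft_cat = 1 when aircraft is "boeing", 0 otherwise).

```latex
\begin{align*}
\Delta\text{Distance}
  &= \beta_2 \,\Delta\text{speed\_ground}
   = 42.79 \times 10 = 427.9 \\
\text{Distance}_{\text{aircraft\_cat}=1} - \text{Distance}_{\text{aircraft\_cat}=0}
  &= \beta_1 = 501.57
\end{align*}
```

In words, each additional 10 MPH of ground speed adds about 427.9 to the predicted landing distance, and two flights with the same ground speed and height differ by about 501.57 depending on the aircraft category.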
Questions:
1. How many observations (flights) do you use to fit your final model? If not all 950 flights, why?
In this model, 832 observations have been used after removing all the duplicates and abnormal values. The removed observations are either duplicate rows or flights that do not comply with the conditions given for acceptable flight landings.
2. Which factors impact the landing distance of a flight, and how?
Landing distance is dependent on the type of aircraft, ground speed and height of the aircraft.
Distance = -2554.47 + 501.57(aircraft_cat) + 42.79(speed_ground) + 12.52(height)
Aircraft_cat is a dummy variable created to distinguish the categories of aircraft, namely 0 for airbus and 1 for boeing. It does not change anything in the results.
3. Is there any difference between the two makes Boeing and Airbus?
Yes. Aircraft category is a determinant variable impacting the landing distance, so for different aircraft types we will observe different values of landing distance. Here, we have 444 observations with aircraft type airbus and 388 observations with aircraft type boeing. T