Analysis and Prediction of Flight Price
Project By - Aditya Aryan
Overview
About
Business Value
Exploratory Data Analysis
Feature Engineering
Model
Results
Conclusion
About
Flight ticket prices can be hard to guess: we might see one price today, check the same
flight tomorrow, and find a different one. Travelers often say that flight ticket prices
are unpredictable. As data scientists, we aim to show that, given the right data, even
these prices can be predicted. The dataset contains flight ticket prices for various
airlines for the year 2019, between multiple cities. Size of training set: 10683 records.
Our goal is to create a machine-learning model for predicting flight ticket prices.
Business Value
The flight ticket price in India is based on the demand and supply model, with few
restrictions on pricing from regulatory bodies. It is often perceived as unpredictable, and
a recent dynamic pricing scheme added to the confusion. The objective is to create a
machine learning model that predicts the flight price from historical data and can serve
as a reference price for customers as well as airline service providers.
Exploratory Data Analysis
The sum of fares peaks in March, which may be because of Holi vacations for schools and
colleges in that month, so many families travel around this time. The count of flights is
lowest in April, as schools and colleges have exams and offices are busy around the end of
quarter 1. From May onwards, it starts to increase.
There are slight differences between the airlines on this graph. Spice Jet and Truejet seem
to have the cheapest flights, while Jet Airways Business and Vistara Premium are more
expensive. However, Jet Airways Business tickets look far more expensive than the other
business-class tickets.
To visualize the price distribution across airlines, I created three price bins: economy
(>5000 and <10000), premium economy (>10000 and <=25000), and first-class (>25000).
For economy, the prices are concentrated in Indigo, Spice Jet, Air India, and Jet Airways.
For premium economy, the prices are concentrated in Air India, Jet Airways, and Multiple
Carriers.
Air India, Jet Airways, Jet Airways Business, and Multiple Carriers offer a few first-class
flights.
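The binning above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used in the project: the function name `price_bin` and the sample fares are hypothetical. Note that, with the thresholds exactly as stated, fares of 5000 or less and a fare of exactly 10000 fall into no bin.

```python
from collections import Counter


def price_bin(price):
    """Map a fare to one of the three bins, using the thresholds as stated."""
    if 5000 < price < 10000:
        return "economy"
    if 10000 < price <= 25000:
        return "premium economy"
    if price > 25000:
        return "first-class"
    # Fares <= 5000 and exactly 10000 are outside the stated ranges.
    return None


# Hypothetical (airline, fare) pairs, invented purely for illustration.
sample = [
    ("Indigo", 6200),
    ("Air India", 12500),
    ("Jet Airways Business", 52000),
    ("Spice Jet", 4100),
]

# Count how many flights each airline has in each bin.
counts = Counter((airline, price_bin(fare)) for airline, fare in sample)
for (airline, bin_name), n in sorted(counts.items(), key=str):
    print(airline, bin_name, n)
```

Counting per airline and bin in this way gives the numbers that a per-bin distribution chart would plot.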
As the bar chart shows, the airfare price range is highest for Delhi and New Delhi.
Possible reasons: jet fuel prices in Delhi increased by 26.4% in 2018, and Delhi is the
national capital, the political seat of power, and a highly visited place for vacations
(the same holds for Bangalore and Cochin). The same reasoning can explain the higher price
range for Delhi as the source of the flight.
Because a direct (non-stop) flight accounts for the fare of only one flight per trip, its
average fare is the lowest. As the number of stops or layovers increases, the fare goes up,
reflecting the additional flights and the other resources they use.
We would expect duration to play a major role in air ticket prices, but we see no such
pattern here. There must be other significant factors affecting airfare, such as the type
of airline, the destination of the flight, and the date of the journey (higher if it
coincides with a public holiday).
Morning and midnight flights tend to be cheaper. Evening and early-morning fares are
higher because of greater demand, as these are the most convenient times to travel for
most people.
Feature Engineering
Within the exploratory data analysis step, we cleaned the dataset by removing duplicate
and null values. The next step was data pre-processing, where we observed that almost all
of the information was stored as strings. Each feature was parsed: day and month were
extracted from the date of journey as integers, and hours and minutes were extracted from
the departure time. Categorical features such as source and destination had to be
converted into numeric values, for which One Hot Encoding and Label Encoding were used.
Flight prices are affected by holidays and distance. I downloaded India holiday data from
Kaggle; for latitude and longitude, I used the Indian Cities Database from Kaggle and
calculated the distance between two cities using the Haversine formula. After that, I
applied sine and cosine transformations to cyclical features such as hour, minute, day,
week, month, and quarter. Then I calculated the Variance Inflation Factor to find
multicollinearity between variables. Multiple columns showed multicollinearity. I used OLS
to check the relation between the variables and their impact on the target variable. I
didn't drop any multicollinear columns, as the linear model performed better with those
features.
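Three of the steps described here (parsing date and time parts, the Haversine distance, and the sine/cosine encoding of cyclical features) can be sketched with the Python standard library alone. This is an illustrative sketch under assumptions, not the project's code: the function names, the `%d/%m/%Y` and `%H:%M` input formats, and the mean Earth radius of 6371 km are all assumed here.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cyclical(value, period):
    """Encode a cyclical value as (sin, cos) so the ends of the cycle meet."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def journey_features(date_str, dep_time_str):
    """Extract integer date/time parts plus cyclical encodings.

    The input formats are assumptions made for this sketch.
    """
    d = datetime.strptime(date_str, "%d/%m/%Y")
    t = datetime.strptime(dep_time_str, "%H:%M")
    month_sin, month_cos = cyclical(d.month, 12)
    hour_sin, hour_cos = cyclical(t.hour, 24)
    return {
        "day": d.day, "month": d.month,
        "hour": t.hour, "minute": t.minute,
        "month_sin": month_sin, "month_cos": month_cos,
        "hour_sin": hour_sin, "hour_cos": hour_cos,
    }


features = journey_features("24/03/2019", "22:20")
print(features["day"], features["month"], features["hour"], features["minute"])
```

The point of the sine/cosine pair is that hour 23 ends up close to hour 0 in feature space, which a plain integer encoding does not capture.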
Model Building and Evaluation
I experimented with the basic algorithms and found that Extra Tree Regressor and Random
Forest Regressor gave the lowest root mean squared error. After that, I tried XGBoost
Regressor and Light Gradient Boosting Regressor. These experiments showed that the
LightGBM model performed well, so I went ahead with it and tuned its hyperparameters to
find the combination that scored best against the ground truth.
Result
LightGBM
Train Set: Root mean squared error: 600.76, Mean absolute error: 399.28, R Squared: 0.982
Test Set: Root mean squared error: 1428.74, Mean absolute error: 723.85, R Squared: 0.907
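For reference, the three metrics reported for LightGBM can be computed as in the sketch below. It uses the standard definitions of RMSE, MAE, and R squared; the function name `regression_metrics` and the toy fares are hypothetical and are not taken from the project's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R squared for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,
    }


# Toy fares invented for illustration only.
y_true = [3000, 5000, 7000, 9000]
y_pred = [3500, 4500, 7000, 10000]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

RMSE penalizes large errors more heavily than MAE, which is why the two are usually reported together.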
Conclusion
This project can help inexperienced travelers save money by showing them trends in flight
prices and giving them a predicted price, which they can use to decide whether to book a
ticket now or later. Across the different models tried, the LightGBM algorithm gave the
best predictions.
