SlideShare a Scribd company logo
1 of 10
To analyze Real-Time Flight data using Spark GraphX,
provide near real-time computation results and
visualize the results.
Case Study: Flight Data Analysis
using Spark GraphXm Statement
Vu Pham GraphX
Big Data Computing
Flight Data Analysis using Spark GraphX
Dataset Description:
The data has been collected from U.S. Department of
Transportation's (DOT) Bureau of Transportation
Statistics site which tracks the on-time performance of
domestic flights operated
Summary information on
by large air carriers. The
the number of on-time,
delayed, canceled, and diverted flights is published in
DOT's monthly Air Travel Consumer Report and in this
dataset of January 2014 flight delays and cancellations.
Vu Pham GraphX
Big Data Computing
Big Data Pipeline for Flight Data Analysis using
Spark GraphX
Vu Pham GraphX
Big Data Computing
UseCases
III.
I. Monitoring Air traffic at airports
II. Monitoring of flight delays
Analysis overall routes and airports
IV. Analysis of routes and airports per airline
Vu Pham GraphX
Big Data Computing
Objectives
Compute the total number of airports
Compute the total number of flight routes
Compute and sort the longest flight routes
Display the airports with the highest incoming flights
Display the airports with the highest outgoing flights
List the most important airports according to PageRank
List the routes with the lowest flight costs
Display the airport codes with the lowest flight costs
List the routes with airport codes which have the lowest
flight costs
Vu Pham GraphX
Big Data Computing
Features
17 Attributes
Attribute Name Attribute Description
dOfM Day of Month
dOfW Day of Week
carrier Unique Airline Carrier Code
tailNum Tail Number
fNum Flight Number Reporting Airline
origin_id Origin Airport Id
origin Origin Airport
dest_id Destination Airport Id
dest Destination Airport
crsdepttime CRS Departure Time (local time: hhmm)
deptime Actual Departure Time
Vu Pham GraphX
Big Data Computing
Features
Attribute Name Attribute Description
depdelaymins Difference in minutes between scheduled
and actual departure time. Early
departures set to 0.
crsarrtime CRS Arrival Time (local time: hhmm)
arrtime Actual Arrival Time (local time: hhmm)
arrdelaymins Difference in minutes between scheduled
and actual arrival time. Early arrivals set to
0.
crselapsedtime CRS Elapsed Time of Flight, in Minutes
dist Distance between airports (miles)
Vu Pham GraphX
Big Data Computing
Sample Dataset
Vu Pham GraphX
Big Data Computing
Graph Operations
Output the cheapest airfare routes
val gg = graph.mapEdges(e => 50.toDouble + e.attr.toDouble / 20)
//Call pregel on graph
val sssp = initialGraph.pregel(Double.PositiveInfinity)(
//vertex program
(id, distCost, newDistCost) => math.min(distCost, newDistCost),triplet => {
//send message
if(triplet.srcAttr + triplet.attr < triplet.dstAttr)
{
Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
}
else
{
Iterator.empty
}
},
//Merge Messages
(a,b) => math.min(a,b)
)
//print routes with lowest flight cost
print("routes with lowest flight cost")
println(sssp.edges.take(10).mkString("n"))
Vu Pham GraphX
Big Data Computing
The growing scale and importance of graph data has driven the
development of specialized graph computation engines
capable of inferring complex recursive properties of graph-
structured data.
In this lecture we have discussed GraphX, a distributed graph
processing framework that unifies graph-parallel and data-
parallel computation in a single system and is capable of
succinctly expressing and efficiently executing the entire graph
analytics pipeline.
Conclusion
Vu Pham GraphX
Big Data Computing

More Related Content

Similar to big data slides.pptx

European Airport PerformanceFramework
European Airport PerformanceFrameworkEuropean Airport PerformanceFramework
European Airport PerformanceFrameworkJLGChico
 
Benefit/Cost Analysis For Ait Traffic Control Towers Presentation
Benefit/Cost Analysis For Ait Traffic Control Towers PresentationBenefit/Cost Analysis For Ait Traffic Control Towers Presentation
Benefit/Cost Analysis For Ait Traffic Control Towers PresentationÜlger Ahmet
 
Prediction of Airlines Delay
Prediction of Airlines Delay Prediction of Airlines Delay
Prediction of Airlines Delay Dinesh Kommireddi
 
A Novel Approach To The Weight and Balance Calculation for The De Haviland Ca...
A Novel Approach To The Weight and Balance Calculation for The De Haviland Ca...A Novel Approach To The Weight and Balance Calculation for The De Haviland Ca...
A Novel Approach To The Weight and Balance Calculation for The De Haviland Ca...CSCJournals
 
Scheduling And Revenue Management Process
Scheduling And Revenue Management ProcessScheduling And Revenue Management Process
Scheduling And Revenue Management Processahmad bassiouny
 
FAMILIARIZATION WITH AVIONICS SUITE
FAMILIARIZATION WITH AVIONICS SUITE FAMILIARIZATION WITH AVIONICS SUITE
FAMILIARIZATION WITH AVIONICS SUITE MIbrar4
 
Simulation_Final_Project_Grp6.pptx
Simulation_Final_Project_Grp6.pptxSimulation_Final_Project_Grp6.pptx
Simulation_Final_Project_Grp6.pptxShivamPandey521834
 
Flow simulation - Casablanca Interrnational Airport
Flow simulation - Casablanca Interrnational AirportFlow simulation - Casablanca Interrnational Airport
Flow simulation - Casablanca Interrnational Airportnsihammou
 
Is Low Cost Carrier Profitable - Norwegian article - Issue No. 1
Is Low Cost Carrier Profitable  - Norwegian article - Issue No. 1Is Low Cost Carrier Profitable  - Norwegian article - Issue No. 1
Is Low Cost Carrier Profitable - Norwegian article - Issue No. 1Mohammed Awad
 
Mlat ads-b-reference-guide
Mlat ads-b-reference-guideMlat ads-b-reference-guide
Mlat ads-b-reference-guideSergio Llugdar
 
Is Low Cost Carrier Profitable -Ryan article - Issue No. 2
Is Low Cost Carrier Profitable -Ryan article - Issue No. 2Is Low Cost Carrier Profitable -Ryan article - Issue No. 2
Is Low Cost Carrier Profitable -Ryan article - Issue No. 2Mohammed Awad
 
Course project for CEE 4674
Course project for CEE 4674Course project for CEE 4674
Course project for CEE 4674Junqi Hu
 
Flight delay detection data mining project
Flight delay detection data mining projectFlight delay detection data mining project
Flight delay detection data mining projectAkshay Kumar Bhushan
 
SFDC Roll-Out v1c
SFDC Roll-Out v1cSFDC Roll-Out v1c
SFDC Roll-Out v1ctravisdow
 

Similar to big data slides.pptx (20)

European Airport PerformanceFramework
European Airport PerformanceFrameworkEuropean Airport PerformanceFramework
European Airport PerformanceFramework
 
Cost Monitoring Framework
Cost Monitoring FrameworkCost Monitoring Framework
Cost Monitoring Framework
 
Benefit/Cost Analysis For Ait Traffic Control Towers Presentation
Benefit/Cost Analysis For Ait Traffic Control Towers PresentationBenefit/Cost Analysis For Ait Traffic Control Towers Presentation
Benefit/Cost Analysis For Ait Traffic Control Towers Presentation
 
Prediction of Airlines Delay
Prediction of Airlines Delay Prediction of Airlines Delay
Prediction of Airlines Delay
 
Poster
PosterPoster
Poster
 
industrial
industrialindustrial
industrial
 
A Novel Approach To The Weight and Balance Calculation for The De Haviland Ca...
A Novel Approach To The Weight and Balance Calculation for The De Haviland Ca...A Novel Approach To The Weight and Balance Calculation for The De Haviland Ca...
A Novel Approach To The Weight and Balance Calculation for The De Haviland Ca...
 
Sergio Martins - OPERATIONS AND PASSENGERS WORKSHOP - PANEL 2
Sergio Martins - OPERATIONS AND PASSENGERS WORKSHOP - PANEL 2Sergio Martins - OPERATIONS AND PASSENGERS WORKSHOP - PANEL 2
Sergio Martins - OPERATIONS AND PASSENGERS WORKSHOP - PANEL 2
 
Scheduling And Revenue Management Process
Scheduling And Revenue Management ProcessScheduling And Revenue Management Process
Scheduling And Revenue Management Process
 
FAMILIARIZATION WITH AVIONICS SUITE
FAMILIARIZATION WITH AVIONICS SUITE FAMILIARIZATION WITH AVIONICS SUITE
FAMILIARIZATION WITH AVIONICS SUITE
 
Simulation_Final_Project_Grp6.pptx
Simulation_Final_Project_Grp6.pptxSimulation_Final_Project_Grp6.pptx
Simulation_Final_Project_Grp6.pptx
 
Flow simulation - Casablanca Interrnational Airport
Flow simulation - Casablanca Interrnational AirportFlow simulation - Casablanca Interrnational Airport
Flow simulation - Casablanca Interrnational Airport
 
Dart presentation 4
Dart presentation 4Dart presentation 4
Dart presentation 4
 
Is Low Cost Carrier Profitable - Norwegian article - Issue No. 1
Is Low Cost Carrier Profitable  - Norwegian article - Issue No. 1Is Low Cost Carrier Profitable  - Norwegian article - Issue No. 1
Is Low Cost Carrier Profitable - Norwegian article - Issue No. 1
 
Mlat ads-b-reference-guide
Mlat ads-b-reference-guideMlat ads-b-reference-guide
Mlat ads-b-reference-guide
 
Is Low Cost Carrier Profitable -Ryan article - Issue No. 2
Is Low Cost Carrier Profitable -Ryan article - Issue No. 2Is Low Cost Carrier Profitable -Ryan article - Issue No. 2
Is Low Cost Carrier Profitable -Ryan article - Issue No. 2
 
Overview of airline booking process
Overview of airline booking processOverview of airline booking process
Overview of airline booking process
 
Course project for CEE 4674
Course project for CEE 4674Course project for CEE 4674
Course project for CEE 4674
 
Flight delay detection data mining project
Flight delay detection data mining projectFlight delay detection data mining project
Flight delay detection data mining project
 
SFDC Roll-Out v1c
SFDC Roll-Out v1cSFDC Roll-Out v1c
SFDC Roll-Out v1c
 

Recently uploaded

PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Digi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxDigi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxTanveerAhmed817946
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 

Recently uploaded (20)

PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Digi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxDigi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptx
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 

big data slides.pptx

  • 1. To analyze Real-Time Flight data using Spark GraphX, provide near real-time computation results and visualize the results. Case Study: Flight Data Analysis using Spark GraphXm Statement Vu Pham GraphX Big Data Computing
  • 2. Flight Data Analysis using Spark GraphX Dataset Description: The data has been collected from U.S. Department of Transportation's (DOT) Bureau of Transportation Statistics site which tracks the on-time performance of domestic flights operated Summary information on by large air carriers. The the number of on-time, delayed, canceled, and diverted flights is published in DOT's monthly Air Travel Consumer Report and in this dataset of January 2014 flight delays and cancellations. Vu Pham GraphX Big Data Computing
  • 3. Big Data Pipeline for Flight Data Analysis using Spark GraphX Vu Pham GraphX Big Data Computing
  • 4. UseCases III. I. Monitoring Air traffic at airports II. Monitoring of flight delays Analysis overall routes and airports IV. Analysis of routes and airports per airline Vu Pham GraphX Big Data Computing
  • 5. Objectives Compute the total number of airports Compute the total number of flight routes Compute and sort the longest flight routes Display the airports with the highest incoming flights Display the airports with the highest outgoing flights List the most important airports according to PageRank List the routes with the lowest flight costs Display the airport codes with the lowest flight costs List the routes with airport codes which have the lowest flight costs Vu Pham GraphX Big Data Computing
  • 6. Features 17 Attributes Attribute Name Attribute Description dOfM Day of Month dOfW Day of Week carrier Unique Airline Carrier Code tailNum Tail Number fNum Flight Number Reporting Airline origin_id Origin Airport Id origin Origin Airport dest_id Destination Airport Id dest Destination Airport crsdepttime CRS Departure Time (local time: hhmm) deptime Actual Departure Time Vu Pham GraphX Big Data Computing
  • 7. Features Attribute Name Attribute Description depdelaymins Difference in minutes between scheduled and actual departure time. Early departures set to 0. crsarrtime CRS Arrival Time (local time: hhmm) arrtime Actual Arrival Time (local time: hhmm) arrdelaymins Difference in minutes between scheduled and actual arrival time. Early arrivals set to 0. crselapsedtime CRS Elapsed Time of Flight, in Minutes dist Distance between airports (miles) Vu Pham GraphX Big Data Computing
  • 8. Sample Dataset Vu Pham GraphX Big Data Computing
  • 9. Graph Operations Output the cheapest airfare routes val gg = graph.mapEdges(e => 50.toDouble + e.attr.toDouble / 20) //Call pregel on graph val sssp = initialGraph.pregel(Double.PositiveInfinity)( //vertex program (id, distCost, newDistCost) => math.min(distCost, newDistCost),triplet => { //send message if(triplet.srcAttr + triplet.attr < triplet.dstAttr) { Iterator((triplet.dstId, triplet.srcAttr + triplet.attr)) } else { Iterator.empty } }, //Merge Messages (a,b) => math.min(a,b) ) //print routes with lowest flight cost print("routes with lowest flight cost") println(sssp.edges.take(10).mkString("n")) Vu Pham GraphX Big Data Computing
  • 10. The growing scale and importance of graph data has driven the development of specialized graph computation engines capable of inferring complex recursive properties of graph- structured data. In this lecture we have discussed GraphX, a distributed graph processing framework that unifies graph-parallel and data- parallel computation in a single system and is capable of succinctly expressing and efficiently executing the entire graph analytics pipeline. Conclusion Vu Pham GraphX Big Data Computing