SlideShare a Scribd company logo
Hal im Abbas 
Rashid Al i 
Amanda 
Gi lber t 
LaTia Jef ferson 
THINK BIG BOOTCAMP 
PROJECT
DATA INGESTION
METHODOLOGY 
 Created Python MapReduce job to format the data for 
ingestion 
 Python dictionary to deal with paired data 
 Logic to ignore lines with data issues 
 Executed a Hadoop streaming job to ingest the data 
 Loaded the data into tables via hive 
 I n g e s te d FA A’ s A i rc r af t Re g i st r y d a t a 
 Re- ingested data by site
PRELIMINARY ANALYSIS
INITIAL EXPLORATION 
 Most frequently repor ting craf ts 
 Percentage of records accepted & ingested 
 99.80% 
 1924 lines rejected
SITE COMPARISON 
Site 1 Site 2 
Number of Sightings 563715 449904 
Average Speed 342.02 395.37 
Average Alt 15919.28 20295.33
SITE LATITUDE LONGITUDE DATA 
 Sampled latitude and longitude data both sites 
 Found average latitude and longitude for each 
 Site one: 42.22, -70.85 
 Site two: 42.12, -71.49
SITE 1 RELATIVE LOCATION
SITE 2 RELATIVE LOCATION
MASTER DATA QUERIES 
CREATE TABLE model_summary AS SELECT mdl_code, make, 
model , max(speed), max(alt) FROM master_data GROUP BY 
mdl_code, make, model; 
CREATE TABLE aircraf t_summary AS SELECT ident , make, 
model , max(speed), max(alt) FROM master_data GROUP BY 
ident , make, model; 
CREATE TABLE owner_summary AS SELECT owner_name, 
count(distinct(hexid) ) AS count_hex FROM master_data GROUP 
BY owner_name;
DATA SCIENCE & 
VISUALIZATIONS
FASTEST PLANES
TOP SPEED VS CRUISING 
ALTITUDE BY MAKE
UNIQUE FLIGHTS BY 
AIRLINE
NUMBER OF SIGHTINGS 
BY AIRCRAFT MAKE
Team3 presentation
Team3 presentation

More Related Content

What's hot

Snow cover assessment tool using Python
Snow cover assessment tool using PythonSnow cover assessment tool using Python
Snow cover assessment tool using Python
Prasun Kumar Gupta
 
Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics
iosrjce
 
Big Data Processing in Pharo
Big Data Processing in PharoBig Data Processing in Pharo
Big Data Processing in Pharo
ESUG
 
Ross McDonald - PgRouting in QGIS
Ross McDonald - PgRouting in QGISRoss McDonald - PgRouting in QGIS
Ross McDonald - PgRouting in QGIS
Ross McDonald
 
LIDAR-derived DTM for archaeology and landscape history research some recent ...
LIDAR-derived DTM for archaeology and landscape history research some recent ...LIDAR-derived DTM for archaeology and landscape history research some recent ...
LIDAR-derived DTM for archaeology and landscape history research some recent ...
Shaun Lewis
 
Build 2017 - B8037 - Explore the next generation of innovative UI in the Visu...
Build 2017 - B8037 - Explore the next generation of innovative UI in the Visu...Build 2017 - B8037 - Explore the next generation of innovative UI in the Visu...
Build 2017 - B8037 - Explore the next generation of innovative UI in the Visu...
Windows Developer
 
First impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithmFirst impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithm
InfoFarm
 
Feature Geo Analytics and Big Data Processing: Hybrid Approaches for Earth Sc...
Feature Geo Analytics and Big Data Processing: Hybrid Approaches for Earth Sc...Feature Geo Analytics and Big Data Processing: Hybrid Approaches for Earth Sc...
Feature Geo Analytics and Big Data Processing: Hybrid Approaches for Earth Sc...
Dawn Wright
 
Data preparation covariates
Data preparation covariatesData preparation covariates
Data preparation covariates
FAO
 
Hadoop - Apache Hive
Hadoop - Apache HiveHadoop - Apache Hive
Hadoop - Apache Hive
Vibrant Technologies & Computers
 
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Jen Aman
 
Spark by Adform Research, Paulius
Spark by Adform Research, PauliusSpark by Adform Research, Paulius
Spark by Adform Research, PauliusVasil Remeniuk
 
LocationTech Projects
LocationTech ProjectsLocationTech Projects
LocationTech Projects
Jody Garnett
 
A time energy performance analysis of map reduce on heterogeneous systems wit...
A time energy performance analysis of map reduce on heterogeneous systems wit...A time energy performance analysis of map reduce on heterogeneous systems wit...
A time energy performance analysis of map reduce on heterogeneous systems wit...
newmooxx
 
Mastering Hadoop Map Reduce - Custom Types and Other Optimizations
Mastering Hadoop Map Reduce - Custom Types and Other OptimizationsMastering Hadoop Map Reduce - Custom Types and Other Optimizations
Mastering Hadoop Map Reduce - Custom Types and Other Optimizations
scottcrespo
 
FOSDEM 2015: Distributed Tile Processing with GeoTrellis and Spark
FOSDEM 2015: Distributed Tile Processing with GeoTrellis and SparkFOSDEM 2015: Distributed Tile Processing with GeoTrellis and Spark
FOSDEM 2015: Distributed Tile Processing with GeoTrellis and Spark
Rob Emanuele
 
Hive
HiveHive
Big Data Day LA 2015 - Large Scale Distinct Count -- The HyperLogLog algorith...
Big Data Day LA 2015 - Large Scale Distinct Count -- The HyperLogLog algorith...Big Data Day LA 2015 - Large Scale Distinct Count -- The HyperLogLog algorith...
Big Data Day LA 2015 - Large Scale Distinct Count -- The HyperLogLog algorith...
Data Con LA
 
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
InfluxData
 
HyperLogLog in Hive - How to count sheep efficiently?
HyperLogLog in Hive - How to count sheep efficiently?HyperLogLog in Hive - How to count sheep efficiently?
HyperLogLog in Hive - How to count sheep efficiently?
bzamecnik
 

What's hot (20)

Snow cover assessment tool using Python
Snow cover assessment tool using PythonSnow cover assessment tool using Python
Snow cover assessment tool using Python
 
Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics
 
Big Data Processing in Pharo
Big Data Processing in PharoBig Data Processing in Pharo
Big Data Processing in Pharo
 
Ross McDonald - PgRouting in QGIS
Ross McDonald - PgRouting in QGISRoss McDonald - PgRouting in QGIS
Ross McDonald - PgRouting in QGIS
 
LIDAR-derived DTM for archaeology and landscape history research some recent ...
LIDAR-derived DTM for archaeology and landscape history research some recent ...LIDAR-derived DTM for archaeology and landscape history research some recent ...
LIDAR-derived DTM for archaeology and landscape history research some recent ...
 
Build 2017 - B8037 - Explore the next generation of innovative UI in the Visu...
Build 2017 - B8037 - Explore the next generation of innovative UI in the Visu...Build 2017 - B8037 - Explore the next generation of innovative UI in the Visu...
Build 2017 - B8037 - Explore the next generation of innovative UI in the Visu...
 
First impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithmFirst impressions of SparkR: our own machine learning algorithm
First impressions of SparkR: our own machine learning algorithm
 
Feature Geo Analytics and Big Data Processing: Hybrid Approaches for Earth Sc...
Feature Geo Analytics and Big Data Processing: Hybrid Approaches for Earth Sc...Feature Geo Analytics and Big Data Processing: Hybrid Approaches for Earth Sc...
Feature Geo Analytics and Big Data Processing: Hybrid Approaches for Earth Sc...
 
Data preparation covariates
Data preparation covariatesData preparation covariates
Data preparation covariates
 
Hadoop - Apache Hive
Hadoop - Apache HiveHadoop - Apache Hive
Hadoop - Apache Hive
 
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
 
Spark by Adform Research, Paulius
Spark by Adform Research, PauliusSpark by Adform Research, Paulius
Spark by Adform Research, Paulius
 
LocationTech Projects
LocationTech ProjectsLocationTech Projects
LocationTech Projects
 
A time energy performance analysis of map reduce on heterogeneous systems wit...
A time energy performance analysis of map reduce on heterogeneous systems wit...A time energy performance analysis of map reduce on heterogeneous systems wit...
A time energy performance analysis of map reduce on heterogeneous systems wit...
 
Mastering Hadoop Map Reduce - Custom Types and Other Optimizations
Mastering Hadoop Map Reduce - Custom Types and Other OptimizationsMastering Hadoop Map Reduce - Custom Types and Other Optimizations
Mastering Hadoop Map Reduce - Custom Types and Other Optimizations
 
FOSDEM 2015: Distributed Tile Processing with GeoTrellis and Spark
FOSDEM 2015: Distributed Tile Processing with GeoTrellis and SparkFOSDEM 2015: Distributed Tile Processing with GeoTrellis and Spark
FOSDEM 2015: Distributed Tile Processing with GeoTrellis and Spark
 
Hive
HiveHive
Hive
 
Big Data Day LA 2015 - Large Scale Distinct Count -- The HyperLogLog algorith...
Big Data Day LA 2015 - Large Scale Distinct Count -- The HyperLogLog algorith...Big Data Day LA 2015 - Large Scale Distinct Count -- The HyperLogLog algorith...
Big Data Day LA 2015 - Large Scale Distinct Count -- The HyperLogLog algorith...
 
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
 
HyperLogLog in Hive - How to count sheep efficiently?
HyperLogLog in Hive - How to count sheep efficiently?HyperLogLog in Hive - How to count sheep efficiently?
HyperLogLog in Hive - How to count sheep efficiently?
 

Viewers also liked

Big Data Conference April 2015
Big Data Conference April 2015Big Data Conference April 2015
Big Data Conference April 2015Aaron Benz
 
So you want to do a Big Data project?
So you want to do a Big Data project?So you want to do a Big Data project?
So you want to do a Big Data project?
Instanssi Oy
 
Big Data project offer for HSL
Big Data project offer for HSLBig Data project offer for HSL
Big Data project offer for HSL
Vladimir Orekhov
 
CitySprint Fleetmapper use case -Big Data Bootcamp
CitySprint  Fleetmapper use case -Big Data BootcampCitySprint  Fleetmapper use case -Big Data Bootcamp
CitySprint Fleetmapper use case -Big Data BootcampEduard Lazar
 
이민의 포트폴리오
이민의 포트폴리오이민의 포트폴리오
이민의 포트폴리오Min Lee
 
포트폴리오 오경원
포트폴리오 오경원포트폴리오 오경원
포트폴리오 오경원Sio Oh
 

Viewers also liked (6)

Big Data Conference April 2015
Big Data Conference April 2015Big Data Conference April 2015
Big Data Conference April 2015
 
So you want to do a Big Data project?
So you want to do a Big Data project?So you want to do a Big Data project?
So you want to do a Big Data project?
 
Big Data project offer for HSL
Big Data project offer for HSLBig Data project offer for HSL
Big Data project offer for HSL
 
CitySprint Fleetmapper use case -Big Data Bootcamp
CitySprint  Fleetmapper use case -Big Data BootcampCitySprint  Fleetmapper use case -Big Data Bootcamp
CitySprint Fleetmapper use case -Big Data Bootcamp
 
이민의 포트폴리오
이민의 포트폴리오이민의 포트폴리오
이민의 포트폴리오
 
포트폴리오 오경원
포트폴리오 오경원포트폴리오 오경원
포트폴리오 오경원
 

Similar to Team3 presentation

Cloudstone - Sharpening Your Weapons Through Big Data
Cloudstone - Sharpening Your Weapons Through Big DataCloudstone - Sharpening Your Weapons Through Big Data
Cloudstone - Sharpening Your Weapons Through Big Data
Christopher Grayson
 
Inter Task Communication On Volatile Nodes
Inter Task Communication On Volatile NodesInter Task Communication On Volatile Nodes
Inter Task Communication On Volatile Nodes
nagarajan_ka
 
Strata Stinger Talk October 2013
Strata Stinger Talk October 2013Strata Stinger Talk October 2013
Strata Stinger Talk October 2013
alanfgates
 
Hadoop trainingin bangalore
Hadoop trainingin bangaloreHadoop trainingin bangalore
Hadoop trainingin bangalore
appaji intelhunt
 
Data Science
Data ScienceData Science
Data Science
Subhajit75
 
Hadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at TwitterHadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at Twitter
DataWorks Summit
 
High-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig LatinHigh-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig Latin
Pietro Michiardi
 
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Alex Levenson
 
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop ClustersHDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
Xiao Qin
 
Hadoop
HadoopHadoop
2014 sept 26_thug_lambda_part1
2014 sept 26_thug_lambda_part12014 sept 26_thug_lambda_part1
2014 sept 26_thug_lambda_part1
Adam Muise
 
MATLAB_BIg_Data_ds_Haddop_22032015
MATLAB_BIg_Data_ds_Haddop_22032015MATLAB_BIg_Data_ds_Haddop_22032015
MATLAB_BIg_Data_ds_Haddop_22032015Asaf Ben Gal
 
Osd ctw spark
Osd ctw sparkOsd ctw spark
Osd ctw spark
Wisely chen
 
The Fundamentals Guide to HDP and HDInsight
The Fundamentals Guide to HDP and HDInsightThe Fundamentals Guide to HDP and HDInsight
The Fundamentals Guide to HDP and HDInsight
Gert Drapers
 
Advanced .NET Data Access with Dapper
Advanced .NET Data Access with Dapper Advanced .NET Data Access with Dapper
Advanced .NET Data Access with Dapper
David Paquette
 
Large-Scaled Telematics Analytics in Apache Spark with Wayne Zhang and Neil P...
Large-Scaled Telematics Analytics in Apache Spark with Wayne Zhang and Neil P...Large-Scaled Telematics Analytics in Apache Spark with Wayne Zhang and Neil P...
Large-Scaled Telematics Analytics in Apache Spark with Wayne Zhang and Neil P...
Databricks
 
Hadoop 101 for bioinformaticians
Hadoop 101 for bioinformaticiansHadoop 101 for bioinformaticians
Hadoop 101 for bioinformaticians
attilacsordas
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
Jim Dowling
 
Hw09 Hadoop Development At Facebook Hive And Hdfs
Hw09   Hadoop Development At Facebook  Hive And HdfsHw09   Hadoop Development At Facebook  Hive And Hdfs
Hw09 Hadoop Development At Facebook Hive And HdfsCloudera, Inc.
 
Scalding Big (Ad)ta
Scalding Big (Ad)taScalding Big (Ad)ta
Scalding Big (Ad)ta
b0ris_1
 

Similar to Team3 presentation (20)

Cloudstone - Sharpening Your Weapons Through Big Data
Cloudstone - Sharpening Your Weapons Through Big DataCloudstone - Sharpening Your Weapons Through Big Data
Cloudstone - Sharpening Your Weapons Through Big Data
 
Inter Task Communication On Volatile Nodes
Inter Task Communication On Volatile NodesInter Task Communication On Volatile Nodes
Inter Task Communication On Volatile Nodes
 
Strata Stinger Talk October 2013
Strata Stinger Talk October 2013Strata Stinger Talk October 2013
Strata Stinger Talk October 2013
 
Hadoop trainingin bangalore
Hadoop trainingin bangaloreHadoop trainingin bangalore
Hadoop trainingin bangalore
 
Data Science
Data ScienceData Science
Data Science
 
Hadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at TwitterHadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at Twitter
 
High-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig LatinHigh-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig Latin
 
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
 
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop ClustersHDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
 
Hadoop
HadoopHadoop
Hadoop
 
2014 sept 26_thug_lambda_part1
2014 sept 26_thug_lambda_part12014 sept 26_thug_lambda_part1
2014 sept 26_thug_lambda_part1
 
MATLAB_BIg_Data_ds_Haddop_22032015
MATLAB_BIg_Data_ds_Haddop_22032015MATLAB_BIg_Data_ds_Haddop_22032015
MATLAB_BIg_Data_ds_Haddop_22032015
 
Osd ctw spark
Osd ctw sparkOsd ctw spark
Osd ctw spark
 
The Fundamentals Guide to HDP and HDInsight
The Fundamentals Guide to HDP and HDInsightThe Fundamentals Guide to HDP and HDInsight
The Fundamentals Guide to HDP and HDInsight
 
Advanced .NET Data Access with Dapper
Advanced .NET Data Access with Dapper Advanced .NET Data Access with Dapper
Advanced .NET Data Access with Dapper
 
Large-Scaled Telematics Analytics in Apache Spark with Wayne Zhang and Neil P...
Large-Scaled Telematics Analytics in Apache Spark with Wayne Zhang and Neil P...Large-Scaled Telematics Analytics in Apache Spark with Wayne Zhang and Neil P...
Large-Scaled Telematics Analytics in Apache Spark with Wayne Zhang and Neil P...
 
Hadoop 101 for bioinformaticians
Hadoop 101 for bioinformaticiansHadoop 101 for bioinformaticians
Hadoop 101 for bioinformaticians
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
Hw09 Hadoop Development At Facebook Hive And Hdfs
Hw09   Hadoop Development At Facebook  Hive And HdfsHw09   Hadoop Development At Facebook  Hive And Hdfs
Hw09 Hadoop Development At Facebook Hive And Hdfs
 
Scalding Big (Ad)ta
Scalding Big (Ad)taScalding Big (Ad)ta
Scalding Big (Ad)ta
 

Recently uploaded

UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 

Recently uploaded (20)

UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 

Team3 presentation

  • 1. Hal im Abbas Rashid Al i Amanda Gi lber t LaTia Jef ferson THINK BIG BOOTCAMP PROJECT
  • 3. METHODOLOGY  Created Python MapReduce job to format the data for ingestion  Python dictionary to deal with paired data  Logic to ignore lines with data issues  Executed a Hadoop streaming job to ingest the data  Loaded the data into tables via hive  I n g e s te d FA A’ s A i rc r af t Re g i st r y d a t a  Re- ingested data by site
  • 4.
  • 6. INITIAL EXPLORATION  Most frequently repor ting craf ts  Percentage of records accepted & ingested  99.80%  1924 lines rejected
  • 7. SITE COMPARISON Site 1 Site 2 Number of Sightings 563715 449904 Average Speed 342.02 395.37 Average Alt 15919.28 20295.33
  • 8. SITE LATITUDE LONGITUDE DATA  Sampled latitude and longitude data both sites  Found average latitude and longitude for each  Site one: 42.22, -70.85  Site two: 42.12, -71.49
  • 9. SITE 1 RELATIVE LOCATION
  • 10. SITE 2 RELATIVE LOCATION
  • 11. MASTER DATA QUERIES CREATE TABLE model_summary AS SELECT mdl_code, make, model , max(speed), max(alt) FROM master_data GROUP BY mdl_code, make, model; CREATE TABLE aircraf t_summary AS SELECT ident , make, model , max(speed), max(alt) FROM master_data GROUP BY ident , make, model; CREATE TABLE owner_summary AS SELECT owner_name, count(distinct(hexid) ) AS count_hex FROM master_data GROUP BY owner_name;
  • 12. DATA SCIENCE & VISUALIZATIONS
  • 14.
  • 15. TOP SPEED VS CRUISING ALTITUDE BY MAKE
  • 16.
  • 17. UNIQUE FLIGHTS BY AIRLINE
  • 18.
  • 19. NUMBER OF SIGHTINGS BY AIRCRAFT MAKE

Editor's Notes

  1. (add top 3)