SlideShare a Scribd company logo
1 of 3
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1670
Study of Prediction Algorithms on Aviation Accident Dataset using
Rapid Miner
Dishank Kaji1, Dhrumil Mehta2, Divyesh Sanghani3, Harsh Shah4, Rashmi Malvankar5
5Under the Guidance of Prof. Rashmi Malvankar, Department of Information Technology,
1,2,3,4Shah and Anchor Kutchhi Engineering College, Mumbai, Maharashtra, India.
-----------------------------------------------------------------------***-----------------------------------------------------------------------
Abstract - A Flight crash is caused due to multiple
factors which can lead to end of many life. In the field of
aircraft, Person’s safety is always considered as an
important factor. Further Analyzing any kind of an incident
in aircraft will help to save life and cost. Prediction is made
based on huge collection of flight records using data mining
techniques and algorithms. To minimize these accidents,
factors causing or contributing to accidents must be
prevented. Flight crash can be caused due to n number of
factors. In this research work, comparative analysis of the
different algorithms like Decision tree and Naive Bayes will
be implemented and a decision will be made on various
parameters whether a flight is ready to take off or not. The
analysis identifies relationships between the accident and
incident data and finds patterns of contributory factors
which are significantly associated with aircraft accidents.
Key Words: Decision Tree, Naive Bayesian
1. INTRODUCTION
Data mining has been intensively and extensively used in
many fields and applications. The data generation and
collection are considered as an important factor for
producing data sets from different kind of sources. Data is
comprehensively collected from NTSB, which records all
the aircraft accidents and it is used as training dataset for
the proposed system. Various attributes which caused the
accidents are analyzed. Huge collections of past accident
records were analyzed from different reports and record
were filtered accordingly to perform different algorithms
and each record was found to be erroneous. Decision Tree
is implemented in a more effective way to handle
nonlinear data. The Naive Bayesian classifier is based on
Bayes’ theorem and it is executed in an effective way such
that it compares records with the independence
assumptions between predictors. A Naive Bayesian model
is easy to build, with no complicated iterative parameter
estimation which makes it particularly for huge collection
of data.
2. PROBLEM DEFINATION
To work with unstructured data and processing it
dynamically is very difficult, requires effort, man power
and is highly time consuming and there is high percentage
of chances that low standard results will be obtained.
Airline Accident is caused by numerous factors and data
of flight crash are unpredictable. So, to process huge
collection of data that has the tendency to change
frequently, various automated tools must be used. In this
case Rapid Miner.
3. DATA COLLECTION
Data is collected from National Transport Safety Board
(NTSB). NTSB keeps record of all the accidents occurred
across all means of transport and has been doing so for
more than ten years. We will be using their aviation
accident dataset which has more than 70,000 records
which gives more data to algorithms to train. This data is
available in .xml, .csv and .txt format.
4. DATA SELECTION AND PREPROCESSING
The NTSB dataset consists of data with large number of
attributes like Type of Event, Location, Longitude,
Latitude, Injury Severity, Aircraft Category, Make, Engine
type, Schedule, Purpose of Flight, Total fatal injuries,
Weather Conditions, Total Uninjured.
Not all of these attributes contribute towards the crash
and so the data must be filtered i.e. only the relevant
attributes must be chosen rest all shall be excluded and
smoothen i.e. the missing values must be taken care of. We
have replaced the missing value with the mean. The
attributes chosen that are relevant towards predicting the
crash are:
i) InjurySeverity: Highest level of injury.
ii) AircraftCategory: It defines the rating,
certification, limitations and privileges of an
air personnel.
iii) Make: Build of the aircraft.
iv) AmateurBuilt
v) Weather condition: There are two types of
weather conditions when concerned to aviation. First
IMC (Instrument meteorological conditions) and
second VMC (visual meteorological conditions)
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1671
3. ALGORITHMS OF PREDICTION
A. DECISION TREE
Decision Tree algorithm is a member of supervised
learning algorithms. Decision tree algorithm is helpful in
filtering regression and classification problems. The
objective of this algorithm is to create a training model
which can be used to predict aircraft accident or decision
rules inferred from prior data. First the whole training set
is considered as the root. Then Feature values are
preferred to be categorical. If the values are continuous
then they are discretized prior to building the model.
Records are distributed recursively on the basis of
attribute values. Order to placing attributes as root or
internal node of the tree is done by using some statistical
approach.ID3 is the decision algorithm used for accident
prediction. ID3 uses information gain as its attribute
selection measure. It decides which attribute (Splitting
point) to test at node N by determining the best way to
separate or partition the tuples into individual classes.
Fig-1: Decision Tree (Rapid miner tool)
RESULT: The Accuracy obtained was 97.77%
PERFORMANCE VECTOR:
Table-1: Performance Vector of Decision Tree
Parameters True Yes True No Class Recall
Pred Yes 22152 488 97.84%
Pred No 39 983 96.18%
Class Recall 99.82% 66.83%
B. NAÏVE BAYESIAN
A Naive Bayes (NB) classifier is a simple probabilistic
classifier based on Bayes theorem where every feature is
assumed to be class-conditionally independent. In naïve
Bayes learning, each instance is described by a set of
features and takes a class value from a predefined set of
values. Classification of instances gets difficult when the
dataset contains a large number of features and classes
because it takes enormous numbers of observations to
estimate the probabilities. In simple terms, a Naive Bayes
classifier assumes that the presence of a particular feature
in a class is unrelated to the presence of any other feature.
Naive Bayes model is easy to build and particularly useful
for very large data sets. Along with simplicity, Naive Bayes
is known to outperform classification methods.
Fig-2: Naïve Based (Rapid miner tool)
RESULT: The Accuracy obtained was 97.31%
PERFORMANCE VECTOR:
Table-2: Performance Vector of Naive Bayesian
Parameters True Yes True No Class Recall
Pred Yes 22011 456 97.97%
Pred No 180 1015 84.94%
Class Recall 99.19% 69.00%
4. CONCLUSION
Data mining platform is very helpful in performing
prediction analysis of an improper dynamic data.
Prediction of aircraft accident is a critical factor. Thus, it
has been implemented with two efficient data mining
algorithms Naive Bayes and Decision Tree. These
algorithms are efficiently used in many fields. Two
algorithms act efficiently in different real time conditions.
On usage of these algorithms’ aircraft accident has been
predicted with standard accuracy.
5. REFERENCES
[1]https://www.ripublication.com/ijaer18/ijaerv13n21_
01.pdf.
[2] http://dataaspirant.com/2017/01/30/how-decision-
tree-algorithm-works/ .
[3] https://ieeexplore.ieee.org/document/6528602.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1672
[4]http://dataillumination.blogspot.com/2015/03/predic
ting-flights-delay-using.html
[5]https://www.ripublication.com/acst17/acstv10n3_10.
pdf
[6]https://www.researchgate.net/publication/32136205
3_Accident_Analysis_by_using_Data_Mining_Techniques
[7] https://www.grin.com/document/386836

More Related Content

What's hot

What's hot (20)

Ijciet 10 01_161
Ijciet 10 01_161Ijciet 10 01_161
Ijciet 10 01_161
 
IRJET - Breast Cancer Risk and Diagnostics using Artificial Neural Network(ANN)
IRJET - Breast Cancer Risk and Diagnostics using Artificial Neural Network(ANN)IRJET - Breast Cancer Risk and Diagnostics using Artificial Neural Network(ANN)
IRJET - Breast Cancer Risk and Diagnostics using Artificial Neural Network(ANN)
 
IRJET - Survey on Analysis of Breast Cancer Prediction
IRJET - Survey on Analysis of Breast Cancer PredictionIRJET - Survey on Analysis of Breast Cancer Prediction
IRJET - Survey on Analysis of Breast Cancer Prediction
 
Techniques to Apply Artificial Intelligence in Power Plants
Techniques to Apply Artificial Intelligence in Power PlantsTechniques to Apply Artificial Intelligence in Power Plants
Techniques to Apply Artificial Intelligence in Power Plants
 
|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain
 
IRJET- Probability based Missing Value Imputation Method and its Analysis
IRJET- Probability based Missing Value Imputation Method and its AnalysisIRJET- Probability based Missing Value Imputation Method and its Analysis
IRJET- Probability based Missing Value Imputation Method and its Analysis
 
Artificial Intelligence based Pattern Recognition
Artificial Intelligence based Pattern RecognitionArtificial Intelligence based Pattern Recognition
Artificial Intelligence based Pattern Recognition
 
Automatic Target Detection using Maximum Average Correlation Height Filter an...
Automatic Target Detection using Maximum Average Correlation Height Filter an...Automatic Target Detection using Maximum Average Correlation Height Filter an...
Automatic Target Detection using Maximum Average Correlation Height Filter an...
 
IRJET - Detection of Malaria using Image Cells
IRJET - Detection of Malaria using Image CellsIRJET - Detection of Malaria using Image Cells
IRJET - Detection of Malaria using Image Cells
 
IRJET - Finger Vein Extraction and Authentication System for ATM
IRJET -  	  Finger Vein Extraction and Authentication System for ATMIRJET -  	  Finger Vein Extraction and Authentication System for ATM
IRJET - Finger Vein Extraction and Authentication System for ATM
 
Data Analysis and Prediction System for Meteorological Data
Data Analysis and Prediction System for Meteorological DataData Analysis and Prediction System for Meteorological Data
Data Analysis and Prediction System for Meteorological Data
 
Data analysis of weather forecasting
Data analysis of weather forecastingData analysis of weather forecasting
Data analysis of weather forecasting
 
IRJET - Analysis of Crop Yield Prediction by using Machine Learning Algorithms
IRJET - Analysis of Crop Yield Prediction by using Machine Learning AlgorithmsIRJET - Analysis of Crop Yield Prediction by using Machine Learning Algorithms
IRJET - Analysis of Crop Yield Prediction by using Machine Learning Algorithms
 
40120130406006
4012013040600640120130406006
40120130406006
 
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...
 
Improving the Performance of Mapping based on Availability- Alert Algorithm U...
Improving the Performance of Mapping based on Availability- Alert Algorithm U...Improving the Performance of Mapping based on Availability- Alert Algorithm U...
Improving the Performance of Mapping based on Availability- Alert Algorithm U...
 
Hybrid Approach for Intrusion Detection Model Using Combination of K-Means Cl...
Hybrid Approach for Intrusion Detection Model Using Combination of K-Means Cl...Hybrid Approach for Intrusion Detection Model Using Combination of K-Means Cl...
Hybrid Approach for Intrusion Detection Model Using Combination of K-Means Cl...
 
Data-Driven Hydrocarbon Production Forecasting Using Machine Learning Techniques
Data-Driven Hydrocarbon Production Forecasting Using Machine Learning TechniquesData-Driven Hydrocarbon Production Forecasting Using Machine Learning Techniques
Data-Driven Hydrocarbon Production Forecasting Using Machine Learning Techniques
 
IRJET- Agricultural Crop Yield Prediction using Deep Learning Approach
IRJET-  	  Agricultural Crop Yield Prediction using Deep Learning ApproachIRJET-  	  Agricultural Crop Yield Prediction using Deep Learning Approach
IRJET- Agricultural Crop Yield Prediction using Deep Learning Approach
 
IRJET - Survey on Clustering based Categorical Data Protection
IRJET - Survey on Clustering based Categorical Data ProtectionIRJET - Survey on Clustering based Categorical Data Protection
IRJET - Survey on Clustering based Categorical Data Protection
 

Similar to IRJET- Study of Prediction Algorithms on Aviation Accident Dataset using Rapid Miner

Predicting the Maintenance of Aircraft Engines using LSTM
Predicting the Maintenance of Aircraft Engines using LSTMPredicting the Maintenance of Aircraft Engines using LSTM
Predicting the Maintenance of Aircraft Engines using LSTM
ijtsrd
 
An intrusion detection algorithm for ami
An intrusion detection algorithm for amiAn intrusion detection algorithm for ami
An intrusion detection algorithm for ami
IJCI JOURNAL
 

Similar to IRJET- Study of Prediction Algorithms on Aviation Accident Dataset using Rapid Miner (20)

Departure Delay Prediction using Machine Learning.
Departure Delay Prediction using Machine Learning.Departure Delay Prediction using Machine Learning.
Departure Delay Prediction using Machine Learning.
 
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine LearningEmail Spam Detection Using Machine Learning
Email Spam Detection Using Machine Learning
 
Network Intrusion Detection System Based on Modified Random Forest Classifier...
Network Intrusion Detection System Based on Modified Random Forest Classifier...Network Intrusion Detection System Based on Modified Random Forest Classifier...
Network Intrusion Detection System Based on Modified Random Forest Classifier...
 
Svm Classifier Algorithm for Data Stream Mining Using Hive and R
Svm Classifier Algorithm for Data Stream Mining Using Hive and RSvm Classifier Algorithm for Data Stream Mining Using Hive and R
Svm Classifier Algorithm for Data Stream Mining Using Hive and R
 
IRJET- Titanic Survival Analysis using Logistic Regression
IRJET-  	  Titanic Survival Analysis using Logistic RegressionIRJET-  	  Titanic Survival Analysis using Logistic Regression
IRJET- Titanic Survival Analysis using Logistic Regression
 
IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...
IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...
IRJET-A Hybrid Intrusion Detection Technique based on IRF & AODE for KDD-CUP ...
 
AIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTIONAIRLINE FARE PRICE PREDICTION
AIRLINE FARE PRICE PREDICTION
 
Predicting the Maintenance of Aircraft Engines using LSTM
Predicting the Maintenance of Aircraft Engines using LSTMPredicting the Maintenance of Aircraft Engines using LSTM
Predicting the Maintenance of Aircraft Engines using LSTM
 
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
 
An intrusion detection algorithm for ami
An intrusion detection algorithm for amiAn intrusion detection algorithm for ami
An intrusion detection algorithm for ami
 
IRJET- Machine Learning based Network Security
IRJET-  	  Machine Learning based Network SecurityIRJET-  	  Machine Learning based Network Security
IRJET- Machine Learning based Network Security
 
IRJET- Intrusion Detection using IP Binding in Real Network
IRJET- Intrusion Detection using IP Binding in Real NetworkIRJET- Intrusion Detection using IP Binding in Real Network
IRJET- Intrusion Detection using IP Binding in Real Network
 
Classification and Prediction Based Data Mining Algorithm in Weka Tool
Classification and Prediction Based Data Mining Algorithm in Weka ToolClassification and Prediction Based Data Mining Algorithm in Weka Tool
Classification and Prediction Based Data Mining Algorithm in Weka Tool
 
TRAFFIC FORECAST FOR INTELLECTUAL TRANSPORTATION SYSTEM USING MACHINE LEARNING
TRAFFIC FORECAST FOR INTELLECTUAL TRANSPORTATION SYSTEM USING MACHINE LEARNINGTRAFFIC FORECAST FOR INTELLECTUAL TRANSPORTATION SYSTEM USING MACHINE LEARNING
TRAFFIC FORECAST FOR INTELLECTUAL TRANSPORTATION SYSTEM USING MACHINE LEARNING
 
IRJET- Machine Learning and Deep Learning Methods for Cybersecurity
IRJET- Machine Learning and Deep Learning Methods for CybersecurityIRJET- Machine Learning and Deep Learning Methods for Cybersecurity
IRJET- Machine Learning and Deep Learning Methods for Cybersecurity
 
IRJET- Fault Detection and Maintenance Prediction for Gear of an Industri...
IRJET-  	  Fault Detection and Maintenance Prediction for Gear of an Industri...IRJET-  	  Fault Detection and Maintenance Prediction for Gear of an Industri...
IRJET- Fault Detection and Maintenance Prediction for Gear of an Industri...
 
Implementation of Secured Network Based Intrusion Detection System Using SVM ...
Implementation of Secured Network Based Intrusion Detection System Using SVM ...Implementation of Secured Network Based Intrusion Detection System Using SVM ...
Implementation of Secured Network Based Intrusion Detection System Using SVM ...
 
A statistical approach to predict flight delay
A statistical approach to predict flight delayA statistical approach to predict flight delay
A statistical approach to predict flight delay
 
Accident Prediction System Using Machine Learning
Accident Prediction System Using Machine LearningAccident Prediction System Using Machine Learning
Accident Prediction System Using Machine Learning
 
Enhanced Bee Colony Optimization Mechanism In Content Recommendation System
Enhanced Bee Colony Optimization Mechanism In Content Recommendation SystemEnhanced Bee Colony Optimization Mechanism In Content Recommendation System
Enhanced Bee Colony Optimization Mechanism In Content Recommendation System
 

More from IRJET Journal

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
Tonystark477637
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Christo Ananth
 

Recently uploaded (20)

MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 

IRJET- Study of Prediction Algorithms on Aviation Accident Dataset using Rapid Miner

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 04 | Apr 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1670 Study of Prediction Algorithms on Aviation Accident Dataset using Rapid Miner Dishank Kaji1, Dhrumil Mehta2, Divyesh Sanghani3, Harsh Shah4, Rashmi Malvankar5 5Under the Guidance of Prof. Rashmi Malvankar, Department of Information Technology, 1,2,3,4Shah and Anchor Kutchhi Engineering College, Mumbai, Maharashtra, India. -----------------------------------------------------------------------***----------------------------------------------------------------------- Abstract - A Flight crash is caused due to multiple factors which can lead to end of many life. In the field of aircraft, Person’s safety is always considered as an important factor. Further Analyzing any kind of an incident in aircraft will help to save life and cost. Prediction is made based on huge collection of flight records using data mining techniques and algorithms. To minimize these accidents, factors causing or contributing to accidents must be prevented. Flight crash can be caused due to n number of factors. In this research work, comparative analysis of the different algorithms like Decision tree and Naive Bayes will be implemented and a decision will be made on various parameters whether a flight is ready to take off or not. The analysis identifies relationships between the accident and incident data and finds patterns of contributory factors which are significantly associated with aircraft accidents. Key Words: Decision Tree, Naive Bayesian 1. INTRODUCTION Data mining has been intensively and extensively used in many fields and applications. The data generation and collection are considered as an important factor for producing data sets from different kind of sources. Data is comprehensively collected from NTSB, which records all the aircraft accidents and it is used as training dataset for the proposed system. Various attributes which caused the accidents are analyzed. Huge collections of past accident records were analyzed from different reports and record were filtered accordingly to perform different algorithms and each record was found to be erroneous. Decision Tree is implemented in a more effective way to handle nonlinear data. The Naive Bayesian classifier is based on Bayes’ theorem and it is executed in an effective way such that it compares records with the independence assumptions between predictors. A Naive Bayesian model is easy to build, with no complicated iterative parameter estimation which makes it particularly for huge collection of data. 2. PROBLEM DEFINATION To work with unstructured data and processing it dynamically is very difficult, requires effort, man power and is highly time consuming and there is high percentage of chances that low standard results will be obtained. Airline Accident is caused by numerous factors and data of flight crash are unpredictable. So, to process huge collection of data that has the tendency to change frequently, various automated tools must be used. In this case Rapid Miner. 3. DATA COLLECTION Data is collected from National Transport Safety Board (NTSB). NTSB keeps record of all the accidents occurred across all means of transport and has been doing so for more than ten years. We will be using their aviation accident dataset which has more than 70,000 records which gives more data to algorithms to train. This data is available in .xml, .csv and .txt format. 4. DATA SELECTION AND PREPROCESSING The NTSB dataset consists of data with large number of attributes like Type of Event, Location, Longitude, Latitude, Injury Severity, Aircraft Category, Make, Engine type, Schedule, Purpose of Flight, Total fatal injuries, Weather Conditions, Total Uninjured. Not all of these attributes contribute towards the crash and so the data must be filtered i.e. only the relevant attributes must be chosen rest all shall be excluded and smoothen i.e. the missing values must be taken care of. We have replaced the missing value with the mean. The attributes chosen that are relevant towards predicting the crash are: i) InjurySeverity: Highest level of injury. ii) AircraftCategory: It defines the rating, certification, limitations and privileges of an air personnel. iii) Make: Build of the aircraft. iv) AmateurBuilt v) Weather condition: There are two types of weather conditions when concerned to aviation. First IMC (Instrument meteorological conditions) and second VMC (visual meteorological conditions)
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 04 | Apr 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1671 3. ALGORITHMS OF PREDICTION A. DECISION TREE Decision Tree algorithm is a member of supervised learning algorithms. Decision tree algorithm is helpful in filtering regression and classification problems. The objective of this algorithm is to create a training model which can be used to predict aircraft accident or decision rules inferred from prior data. First the whole training set is considered as the root. Then Feature values are preferred to be categorical. If the values are continuous then they are discretized prior to building the model. Records are distributed recursively on the basis of attribute values. Order to placing attributes as root or internal node of the tree is done by using some statistical approach.ID3 is the decision algorithm used for accident prediction. ID3 uses information gain as its attribute selection measure. It decides which attribute (Splitting point) to test at node N by determining the best way to separate or partition the tuples into individual classes. Fig-1: Decision Tree (Rapid miner tool) RESULT: The Accuracy obtained was 97.77% PERFORMANCE VECTOR: Table-1: Performance Vector of Decision Tree Parameters True Yes True No Class Recall Pred Yes 22152 488 97.84% Pred No 39 983 96.18% Class Recall 99.82% 66.83% B. NAÏVE BAYESIAN A Naive Bayes (NB) classifier is a simple probabilistic classifier based on Bayes theorem where every feature is assumed to be class-conditionally independent. In naïve Bayes learning, each instance is described by a set of features and takes a class value from a predefined set of values. Classification of instances gets difficult when the dataset contains a large number of features and classes because it takes enormous numbers of observations to estimate the probabilities. In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. Naive Bayes model is easy to build and particularly useful for very large data sets. Along with simplicity, Naive Bayes is known to outperform classification methods. Fig-2: Naïve Based (Rapid miner tool) RESULT: The Accuracy obtained was 97.31% PERFORMANCE VECTOR: Table-2: Performance Vector of Naive Bayesian Parameters True Yes True No Class Recall Pred Yes 22011 456 97.97% Pred No 180 1015 84.94% Class Recall 99.19% 69.00% 4. CONCLUSION Data mining platform is very helpful in performing prediction analysis of an improper dynamic data. Prediction of aircraft accident is a critical factor. Thus, it has been implemented with two efficient data mining algorithms Naive Bayes and Decision Tree. These algorithms are efficiently used in many fields. Two algorithms act efficiently in different real time conditions. On usage of these algorithms’ aircraft accident has been predicted with standard accuracy. 5. REFERENCES [1]https://www.ripublication.com/ijaer18/ijaerv13n21_ 01.pdf. [2] http://dataaspirant.com/2017/01/30/how-decision- tree-algorithm-works/ . [3] https://ieeexplore.ieee.org/document/6528602.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 04 | Apr 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1672 [4]http://dataillumination.blogspot.com/2015/03/predic ting-flights-delay-using.html [5]https://www.ripublication.com/acst17/acstv10n3_10. pdf [6]https://www.researchgate.net/publication/32136205 3_Accident_Analysis_by_using_Data_Mining_Techniques [7] https://www.grin.com/document/386836