Over the past decades, considerable effort has been devoted to roadway traffic safety. With the help of data mining, analysis of roadway traffic data is needed to understand the factors related to fatal accidents. This paper analyses the Fatality Analysis Reporting System (FARS) dataset using several data mining algorithms. We compare the performance of four meta-classifiers and four data-oriented techniques known for their ability to handle imbalanced datasets, all built on the Random Forest classifier. We also study the effect of several feature selection algorithms, including PSO, Cuckoo, Bat and Tabu search, on the accuracy and efficiency of classification. The empirical results show that the Threshold Selector meta-classifier combined with over-sampling techniques is very satisfactory: the proposed approach achieved a mean overall accuracy of 91% and a balanced accuracy between 96% and 99% using only 7-15 of the 50 original features.
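The abstract's two ingredients can be made concrete. Below is a minimal, self-contained sketch (function names and data are illustrative, not the paper's WEKA-based pipeline) of balanced accuracy, the imbalance-robust metric the paper reports, and of random over-sampling, the simplest of the data-oriented techniques it compares:

```python
import random
from collections import Counter

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls; robust to class imbalance, unlike plain accuracy."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, y in enumerate(y_true) if y == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows until every class matches the majority count."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    Xr, yr = list(X), list(y)
    for c, n in counts.items():
        pool = [i for i, lab in enumerate(y) if lab == c]
        for _ in range(target - n):
            i = rng.choice(pool)
            Xr.append(X[i])
            yr.append(y[i])
    return Xr, yr
```

Note how a classifier that always predicts the majority class scores 80% plain accuracy on a 4:1 dataset but only 50% balanced accuracy, which is why the paper reports both.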
EVALUATION OF PARTICLE SWARM OPTIMIZATION ALGORITHM IN PREDICTION OF THE CAR ...ijcsa
Road traffic accidents are among the most common accidents and endanger many lives around the world every year. Iran is one of the countries with the highest incidence of, and mortality from, road accidents, which makes it necessary to identify the underlying dimensions of the problem. As the number of car accidents grows, the volume of related information grows with it, and uncovering the hidden dependencies in this information takes a very long time. Traditional methods cannot discover these complex relations between the involved factors, so new techniques are needed. The main aim of this paper is to find the best relationships within large volumes of information in the shortest time. Accordingly, we classify accidents in the West Azerbaijan province of Iran by accident type (damage, injury, death) and describe them using the Particle Swarm Optimization (PSO) algorithm.
Data Mining Approach of Accident Occurrences Identification with Effective M...IJECEIAES
Data mining is used across research domains to identify causes for effects in society. This article applies data mining to identify accident occurrences in different regions and to find the most plausible reasons why accidents happen around the globe. Data mining and advanced machine learning algorithms are used in this approach, and the article discusses hyperplanes, classification, pre-processing of the data, and training the machine with sample datasets collected from different regions, containing both structured and semi-structured data. We examine machine learning and data mining classification algorithms to predict something novel about accident occurrences, concentrating on two basic and important classification algorithms: SVM (Support Vector Machine) and the CNB classifier. The discussion uses the WEKA tool for the CNB classifier, bag-of-words identification, word count, and frequency calculation.
Identification of road traffic fatal crashes leading factors using principal ...eSAT Journals
Abstract
Traffic crash fatalities are a primary safety concern, beyond traffic congestion and delay. The purpose of this study is therefore to identify the principal components/factors associated with road traffic crashes in the U.S. through a retrospective review of more than two million fatal-crash records spanning 38 years (1975-2012) of the National Highway Traffic Safety Administration's official Fatal Accident Reporting System (FARS) database. The study integrates a geographic information system with a SAS application to find the major factors driving traffic crashes. The geospatial analysis and principal components analysis yielded the significant factors causing fatal traffic crashes. The outcomes of this research could contribute significantly to transportation safety policy making and planning.
Key Words: Accident Analysis Prevention, Clustering, Crash Hot Spot, Geographic Information Systems, Principal
Components Analysis, and Traffic Crash
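The principal components analysis this abstract relies on can be sketched compactly. The snippet below is an illustrative NumPy implementation (the function name and data are hypothetical, not the study's SAS workflow): it centres the data, eigendecomposes the covariance matrix, and returns the top-k axes with their explained-variance ratios:

```python
import numpy as np

def pca_components(X, k):
    """Top-k principal axes and explained-variance ratios.

    X: (n_samples, n_features) array; rows are crash records, columns factors.
    """
    Xc = X - X.mean(axis=0)              # centre each factor
    cov = np.cov(Xc, rowvar=False)       # feature covariance matrix
    vals, vecs = np.linalg.eigh(cov)     # eigh: symmetric input, ascending order
    order = np.argsort(vals)[::-1]       # re-sort descending by variance
    vals, vecs = vals[order], vecs[:, order]
    ratios = vals / vals.sum()
    return vecs[:, :k], ratios[:k]
```

A factor that dominates the variance (say, one measured on a much larger scale) will dominate the first component, which is why crash factors are typically standardized before PCA.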
APPLYING DATA ENVELOPMENT ANALYSIS AND CLUSTERING ANALYSIS IN ENHANCING THE P...IJITCA Journal
Data envelopment analysis is a method for assessing and evaluating the relative performance of organizational entities where the presence of multiple inputs and outputs makes comparison difficult. Data envelopment analysis was applied in Philippine National Police District VI in the Province of Cavite to measure the efficiency of its decision-making units in terms of their resources. Clustering is the process of grouping and analyzing objects with similar characteristics, and a clustering algorithm is used in this study to help identify crime patterns. The algorithm was implemented in the application software Crime Management System (CriMS) to predict crime patterns and help Philippine National Police District VI decrease the total crime volume and increase the number of crimes solved, countervailing the security concerns of individuals, the community, and the state. Further studies must be conducted to determine the usefulness of the application software, leading to an empirical study of the rule set used to determine predictive accuracy and/or software productivity.
Comparing the two techniques Tripod Beta and Mort at a critical accident anal...IJERA Editor
Accidents are one of the leading causes of death and disability. Despite great efforts made to prevent accidents, there is still no coherent system for identifying the root causes of industrial accidents. Selecting appropriate accident analysis techniques and comparing them can be useful in this regard. This research aimed to analyze a fatal accident in a power plant construction project using the two methods of MORT and Tripod-Beta, and to compare the analyses. First, the report of the selected accident was studied and the accident was analyzed with both MORT and Tripod-Beta. The two methods were then compared and assessed on measures of time, cost, training needs, the need for technical personnel, the number of causes identified, quantifiability, and the need for software to conduct the analysis. The Tripod-Beta analysis cost less and required less time and fewer technical experts. A thorough analysis of major accidents needs to identify all possible causes of the incident, including human error and equipment failure; therefore, the complementary use of both industrial accident analysis techniques is recommended.
A multi-objective evolutionary scheme for control points deployment in intell...IJECEIAES
One of the problems hindering emergency response in developing countries is monitoring activities on inter-urban roadway networks. In the literature, control points have been proposed in this context to ensure efficient monitoring: providing good coverage while minimizing installation costs as well as the number of accidents across these road networks. In this work, we propose an optimal deployment of these control points using several evolutionary multi-objective optimization methods: the non-dominated sorting genetic algorithm-II (NSGA-II), multi-objective particle swarm optimization (MOPSO), the strength Pareto evolutionary algorithm-II (SPEA-II), and the Pareto envelope-based selection algorithm-II (PESA-II). We tested and compared the resulting deployments using the Pareto front and performance indicators such as spread, hypervolume, and inverted generational distance (IGD). The results show that NSGA-II is the most suitable method for deploying these control points.
Public Perception on Road Accidents: A Case Study of Mahasarakham City, Thailand drboon
This study surveyed public perception of the causes of roadway accidents in the Mahasarakham City area. The 400 questionnaire respondents span different ages, occupations, and backgrounds; the majority have never experienced a crash. The average perception across all respondents indicates that the human factor is the most dominant cause of road accidents. In addition, most respondents rate drunk driving as the most dangerous driving behavior. Respondents with different educational backgrounds largely agree on most of the items that cause street accidents.
Classification and Predication of Breast Cancer Risk Factors Using Id3 theijes
Breast cancer is considered the first or second leading cause of death among women each year. Many risk factors give an indication of the possibility of future cancer development; therefore, a predictive system based on risk factors will help in the early detection of this disease. Here we used ID3 and, in the WEKA software, the J48 algorithm to build a predictive classifier tree that groups the dataset's risk factors by priority and gives a future likelihood of developing cancer. The results show that family history is the most influential factor, in addition to other factors.
PREDICTING ACCIDENT SEVERITY: AN ANALYSIS OF FACTORS AFFECTING ACCIDENT SEVER...IJCI JOURNAL
Road accidents have significant economic and societal costs, with a small number of severe accidents accounting for a large portion of these costs. Predicting accident severity supports a proactive approach to road safety by identifying potentially unsafe road conditions and taking well-informed actions to reduce the number of severe accidents. This study investigates the effectiveness of the Random Forest machine learning algorithm for predicting the severity of an accident. The model is trained on a dataset of accident records from a large metropolitan area and evaluated using various metrics. Hyperparameters and feature selection are optimized to improve the model's performance. The results show that the Random Forest model is an effective tool for predicting accident severity, with an accuracy of over 80%. The study also identifies the six most important variables in the model: wind speed, pressure, humidity, visibility, clear conditions, and cloud cover. The fitted model has an Area Under the Curve of 80%, a recall of 79.2%, a precision of 97.1%, and an F1 score of 87.3%. These results suggest that the model performs well in explaining the target variable, the accident severity class. Overall, the study provides evidence that the Random Forest model is a viable and reliable tool for predicting accident severity and can help reduce the number of fatalities and injuries due to road accidents in the United States.
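The reported metrics are related in a checkable way: F1 is the harmonic mean of precision and recall. A small sketch (helper names are illustrative; the confusion counts in the test are made up, not the study's) verifies that the figures quoted above are mutually consistent:

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def classification_metrics(tp, fp, fn):
    """Precision, recall and F1 for the positive class from raw confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, f1_from(precision, recall)

# The reported F1 of 87.3% matches (up to rounding) the reported
# precision of 97.1% and recall of 79.2%:
print(round(f1_from(0.971, 0.792), 3))  # -> 0.872
```

The large gap between precision (97.1%) and recall (79.2%) suggests the model is conservative: it rarely flags a non-severe accident as severe, but misses about a fifth of the truly severe ones.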
This work discusses the study and development of a graphical interface and the implementation of a machine learning model for predicting vehicle traffic injuries and fatalities for a specified date range and a given zip (US postal) code, based on New York City's (NYC) vehicle crash dataset. While previous studies focused on accident causes, little insight has been offered into how such data may be used to forecast future incidents; and where past studies have concentrated on certain road segment types, such as highways, and on specific geographic regions, this study offers a citywide review of collisions. Using modern database and networking technology, a user-friendly interface was created to display vehicle crash series. A support vector machine model was then built to evaluate the likelihood of an accident and the consequent injuries and deaths at the zip code level for all of NYC, to better mitigate such events. The findings show that the visualization and prediction approach is efficient and accurate. Beyond transportation experts and government policymakers, the machine learning approach delivers useful insights to the insurance business, since it quantifies collision risk at specific places.
Analysis of Machine Learning Algorithm with Road Accidents Data Sets Dr. Amarjeet Singh
At present, road transport systems fail to keep up with the exponential growth in vehicle populations, and computing the fastest driving routes while monitoring varying traffic conditions and forecasting accidents is a critical problem. To address this, we explore the vehicle dataset with ensemble learning methods to find the best route choice, forecasting accident risk and comparing the accuracy of supervised machine learning algorithms. In statistics and machine learning, ensemble methods combine multiple learning algorithms to improve predictive performance. The evaluation of the dataset with supervised machine learning techniques (SMLT) covers variable identification, univariate, bivariate and multivariate analysis, missing-value treatment, data validation, data cleaning/preparation, and data visualization of the whole dataset. In addition, we examine and discuss the performance of different machine learning algorithms on the given dataset, with GUI-based road accident prediction from the given attributes.
Study On Traffic Conflict At Unsignalized Intersection In Malaysia IOSR Journals
This research concerns traffic conflicts at unsignalized intersections. Using accident data alone to identify hazardous locations leads to less accurate countermeasures, because accidents are not always reported, especially damage-only accidents, which weakens comparative analysis. To overcome this lack of accident data, many ways of employing non-accident data have been suggested. One such source is traffic conflicts, defined as critical incidents that do not necessarily involve collisions. The traffic conflict technique was originally set up to provide more reliable data and information on traffic problems at intersections, replacing unclear and incomplete accident records. The conflict study was carried out at a selected unsignalized intersection, where the types of traffic conflict were identified and classified, and the various road users involved in conflicts were observed. The captured conflict data were then analyzed using a computer program to detect conflicts at the intersections. A linear regression graph was used to show the relationship between conflict and accident data, from which two equations were derived. These equations may be used to predict the relationship between the two variables at other locations.
Utilizing GIS to Develop a Non-Signalized Intersection Data Inventory for Saf...IJERA Editor
Roadway data inventories are being used across the nation to aid state Departments of Transportation (DOTs) in decision making. The high number of intersection and intersection-related crashes suggests the need for intersection-specific data inventories that can be associated with crash occurrences to support better safety decisions. Currently, limited time and resources are the biggest obstacles to comprehensive intersection data inventories, but online resources exist that DOTs can leverage to capture the desired data. Researchers from The University of Alabama developed an online method to collect intersection characteristics for non-signalized intersections along state routes using Google Maps and Google Street View, tied to an Alabama DOT-maintained geographic information systems (GIS) node-link linear referencing method. A GIS-Based Intersection Data Inventory Web Portal was created to collect and record non-signalized intersection parameters. Thirty intersections of each of nine different intersection types were randomly selected from across the state, totaling 270 intersections. For each intersection, up to 78 parameters were collected, compliant with the Model Inventory of Roadway Elements (MIRE) schema. Using the web portal, the data parameters for an average intersection can be collected and catalogued into a database in approximately 10 minutes. The collection methodology and web portal function independently of the linear referencing method; therefore, the tool can be tailored and used by any state with spatial roadway data. Preliminary single-variable analysis showed relationships between individual intersection characteristics and crash frequency. Future work will investigate multivariate analysis and develop safety performance functions and crash modification factors.
Consequences of Road Traffic Accident in Nigeria: Time Series Approach Editor IJCATR
Road traffic accidents in Nigeria are increasing at a worrying rate and have become one of the country's major concerns. We provide suitable time series models for the consequences of road accidents: the injured, the killed, and the total casualties along Nigerian motorways. The most widely used conventional method, the Autoregressive Integrated Moving Average (ARIMA) model, also known as the Box-Jenkins method, is applied to yearly road accident data in Nigeria from 1960-2013 to determine patterns in the accident consequences. Appropriate models are developed for each consequence: an ARIMA(0,2,1) model is obtained for the injured and total-casualty series, whilst an ARIMA(1,2,2) model is obtained for the killed series, using data from 1960-2011. The adequacy and performance of the models are tested on the remaining data from 2012 to 2013. Seven-year forecasts from the developed models show that the examined road traffic accident consequences (injured, killed, and total casualties) will, on average, continue to increase.
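In ARIMA(p,d,q) notation, ARIMA(0,2,1) means no autoregressive terms (p=0), second-order differencing (d=2), and one moving-average term (q=1). The d=2 step is easy to illustrate: differencing twice removes a quadratic trend, which suits series that accelerate over time. A minimal NumPy sketch with illustrative numbers (not the Nigerian data):

```python
import numpy as np

# A series with a constant quadratic trend, like steadily accelerating
# casualty counts (illustrative numbers only).
series = np.array([1.0, 4.0, 9.0, 16.0, 25.0])

# d=2 in ARIMA(0,2,1): difference the series twice to remove the trend.
second_diff = np.diff(series, n=2)
print(second_diff)  # a quadratic trend becomes constant after two differences
```

After this transformation, the remaining MA(1) term models each value as a short-memory shock, which is the part the Box-Jenkins procedure fits to the data.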
Black spots identification on rural roads based on extremelearning machineIJECEIAES
Accident black spots are usually defined as road locations with a high risk of fatal accidents. A thorough analysis of these areas is essential to determine the real causes of accident mortality and can thus help anticipate the decisions needed to mitigate their effects. In this context, this study develops a model for the identification, classification, and analysis of black spots on roads in Morocco. These areas are first identified using the extreme learning machine (ELM) algorithm, and the infrastructure factors are then analyzed by ordinal regression. The XGBoost model is adopted to generate a weighted severity index (WSI), which in turn yields severity scores assigned to individual road segments. The segments are then classified into four classes (high, medium, low, and safe) using a categorization approach. Finally, a bagging extreme learning machine classifies the severity of road segments according to infrastructure and environmental factors. Simulation results show that the proposed framework identified black spots accurately and efficiently and outperformed reputable competing models, notably with an accuracy of 98.6%. The ordinal analysis revealed that pavement width, road curve type, shoulder width, and position were the significant factors contributing to accidents on rural roads.
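The extreme learning machine at the core of this framework is simple to sketch: hidden-layer weights are drawn at random and never trained, and only the output weights are fit, via a single least-squares solve. The class below is an illustrative toy (hyperparameters and features are hypothetical, not the paper's tuned model):

```python
import numpy as np

class ELMClassifier:
    """Minimal extreme learning machine: random hidden layer, least-squares output."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        n_features = X.shape[1]
        # Hidden weights are random and never updated -- the key ELM trick,
        # which makes training a single linear solve instead of backprop.
        self.W = self.rng.normal(size=(n_features, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)          # hidden-layer activations
        self.beta = np.linalg.pinv(H) @ y         # least-squares output weights
        return self

    def predict(self, X):
        H = np.tanh(X @ self.W + self.b)
        return (H @ self.beta > 0.5).astype(int)  # threshold the linear output
```

Because training is one pseudo-inverse computation, ELMs are fast enough to retrain repeatedly, which is what makes the bagging (ensemble-of-ELMs) variant used in the paper practical.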
Comparing the two techniques Tripod Beta and Mort at a critical accident anal...IJERA Editor
Accidents are one of the leading causes of death and disability. Despite great efforts made to prevent accidents, there is still no coherent system to identify the root causes of industrial accidents. Selection of appropriate accident analysis techniques and their comparison can be useful in this regard. This research aimed to analyze a fatal accident in a power plant construction project using the two methods of MORT and Tripod-Beta, and the comparison of the analyses. First, the report of the selected accident was studied, and the accident was analyzed by the two methods of MORT and Tripod-Beta. The next step was followed by the comparison and assessment of the methods of MORT and Tripod-Beta with the measures of time, cost, training needs, the need for technical forces, the number of causes identified, quantifiable, and the need for software to conduct analysis. The Tripod-Beta accident analysis cost less and requires less time, and less technical experts. Thorough analysis of major accidents needs to identify all the possible causes of the incident, including human error and equipment failure. Therefore, the complimentary use of both techniques of industrial accident analysis is recommended.
A multi-objective evolutionary scheme for control points deployment in intell...IJECEIAES
One of the problems that hinder emergency in developing countries is the problem of monitoring a number of activities on inter-urban roadway networks. In the literature, the use of control points is proposed in the context of these countries in order to ensure efficient monitoring, by ensuring a good coverage while minimizing the installation costs as well as the number of accidents across these road networks. In this work, we propose an optimal deployment of these control points from several optimization methods based on some evolutionary multi-objective algorithms: the Non dominated sorting genetic algorithm-II (NSGA-II); the multi-objective particle swarm optimization (MOPSO); the strength pareto evolutionary algorithm-II (SPEA-II); and the pareto envelope based selection algorithm-II (PESA-II). We performed the tests and compared these deployments using pareto front and performance indicators like the spread and hypervolume and the inverted generational distance (IGD). The results obtained show that the NSGA-II method is the most adequate in the deployment of these control points.
Public Perception on Road Accidents: A Case Study of Mahasarakham City, Thailanddrboon
This study surveyed public perception on the causes of roadway accidents in Mahasarakam City area. There are 400 questionnaire respondents from different ages, occupations, and background. The majority of respondents never have an experience of confronting crash. Average perception of all respondents indicates that human is the most dominant factor causing road accidents. In addition, most respondents give high score to drunk driving as the most dangerous driving behavior. Respondents with different educational backgrounds seem to have agreeable perception on most of the items that cause street accidents.
Classification and Predication of Breast Cancer Risk Factors Using Id3theijes
Breast cancer considered as the first or second cause of death of women each year. There are many risk factors give an indication for the possibility of future cancer development. Therefore, a predictive system depends on risk factors will help in early detection of this disease. Here we used an ID3, by WEKA software, J48 algorithm to build a predictive system of a classifier tree to classify the dataset groups of risk factors according to its priority and gives a future likelihood to develop cancer, the results show that the family history is the most affecting factor in addition to other factors
PREDICTING ACCIDENT SEVERITY: AN ANALYSIS OF FACTORS AFFECTING ACCIDENT SEVER...IJCI JOURNAL
Road accidents have significant economic and societal costs, with a small number of severe accidents
accounting for a large portion of these costs. Predicting accident severity can help in the proactive
approach to road safety by identifying potential unsafe road conditions and taking well-informed
actions to reduce the number of severe accidents. This study investigates the effectiveness of the
Random Forest machine learning algorithm for predicting the severity of an accident. The model is
trained on a dataset of accident records from a large metropolitan area and evaluated using various
metrics. Hyperparameters and feature selection are optimized to improve the model's performance.
The results show that the Random Forest model is an effective tool for predicting accident severity with
an accuracy of over 80%. The study also identifies the top six most important variables in the model,
which include wind speed, pressure, humidity, visibility, clear conditions, and cloud cover. The fitted
model has an Area Under the Curve of 80%, a recall of 79.2%, a precision of 97.1%, and an F1 score
of 87.3%. These results suggest that the proposed model has higher performance in explaining the
target variable, which is the accident severity class. Overall, the study provides evidence that the
Random Forest model is a viable and reliable tool for predicting accident severity and can be used to
help reduce the number of fatalities and injuries due to road accidents in the United States.
This work discusses the study and development of a graphical interface and implementation of a machine learning model for vehicle traffic injury and fatality prediction for a specified date range and for a certain zip (US postal) code based on the New York City's (NYC) vehicle crash data set. While previous studies focused on accident causes, little insight has been offered into how such data may be utilized to forecast future incidents, Studies that have historically concentrated on certain road segment types, such as highways and other streets, and a specific geographic region, this study offers a citywide review of collisions. Using cutting-edge database and networking technology, a user-friendly interface was created to display vehicle crash series. Following this, a support vector machine learning model was built to evaluate the likelihood of an accident and the consequent injuries and deaths at the zip code level for all of NYC and to better mitigate such events. Using the visualization and prediction approach, the findings show that it is efficient and accurate. Aside from transportation experts and government policymakers, the machine learning approach deliver useful insights to the insurance business since it quantifies collision risk data collected at specific places.
Analysis of Machine Learning Algorithm with Road Accidents Data Sets - Dr. Amarjeet Singh
At present, road transport systems fail to keep pace with the exponential growth in vehicle numbers, and calculating the fastest driving routes and anticipating accidents under varying traffic conditions is a critical problem. To address this, the vehicle accident dataset is explored with ensemble learning techniques to find the best route choice and to forecast accidents, selecting the most accurate model by comparing supervised machine learning algorithms. In statistics and machine learning, ensemble methods combine multiple learning algorithms to obtain better predictive performance. The evaluation of the dataset by supervised machine learning techniques (SMLT) covers variable identification, univariate, bivariate and multivariate analysis, missing value treatment, data validation, data cleaning/preparation, and data visualization over the entire dataset. In addition, the performance of different machine learning algorithms on the vehicle accident dataset is compared and discussed, together with a GUI-based road accident prediction from given attributes.
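The core idea of the ensemble methods mentioned above is to combine the votes of several base learners. A minimal hard-voting sketch with toy base-classifier outputs (not the paper's models):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine base-classifier predictions by majority vote (hard voting)."""
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical base classifiers disagreeing on one sample
votes = ["accident", "no_accident", "accident"]
ensemble_prediction = majority_vote(votes)
```

Real ensembles such as Random Forest additionally decorrelate the base learners (bootstrap samples, random feature subsets), but the aggregation step is this vote.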
Study on Traffic Conflict at Unsignalized Intersection in Malaysia - IOSR Journals
The research conducted concerns traffic conflict at unsignalized intersections. The motivation is that
accident data used to identify hazardous locations can lead to less accurate countermeasures,
because accidents are not always reported, especially accidents involving damage only, and this
undermines good comparative analysis. To overcome this lack of accident data, many ways of employing
non-accident data have been suggested. One such approach uses traffic conflicts, which are defined
as critical incidents not necessarily involving collisions. The traffic conflict technique was originally set up to
provide more reliable data and information on traffic problems at intersections, replacing
unclear and incomplete recorded accident data. The conflict study was done at a selected unsignalized
intersection where types of traffic conflict could be identified and classified. The various road users involved in
conflicts at the unsignalized intersection were also observed. The captured conflict data were then analyzed using
a computer program to observe conflicts at the intersections. A linear regression graph was used to
show the relationship between conflict and accident data, from which two different equations were
derived. These equations may be used to predict the relationship that might exist between those two
variables at another location.
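The regression relating conflicts to accidents is an ordinary least-squares fit. A minimal sketch, with made-up conflict and accident counts rather than the study's observations:

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = a + b*x for paired observations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = covariance(x, y) / variance(x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
        sum((xi - mean_x) ** 2 for xi in x)
    a = mean_y - b * mean_x  # intercept through the means
    return a, b

# Hypothetical weekly conflict counts vs. recorded accidents at five intersections
conflicts = [12, 20, 35, 41, 58]
accidents = [1, 2, 4, 4, 6]
a, b = linear_fit(conflicts, accidents)
```

A positive slope b is what lets conflict counts stand in for under-reported accident data when ranking locations.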
Utilizing GIS to Develop a Non-Signalized Intersection Data Inventory for Saf... - IJERA Editor
Roadway data inventories are being used across the nation to aid state Departments of Transportation (DOTs) in decision making. The high number of intersection and intersection-related crashes suggests the need for intersection-specific data inventories that can be associated with crash occurrences to help make better safety decisions. Currently, limited time and resources are the biggest difficulties in executing comprehensive intersection data inventories, but online resources exist that DOTs can leverage to capture desired data. Researchers from The University of Alabama developed an online method to collect intersection characteristics for non-signalized intersections along state routes using Google Maps and Google Street View, which was tied to an Alabama DOT maintained geographic information systems (GIS) node-link linear referencing method. A GIS-Based Intersection Data Inventory Web Portal was created to collect and record non-signalized intersection parameters. Thirty intersections of each of nine different intersection types were randomly selected from across the state, totaling 270 intersections. For each intersection, up to 78 parameters were collected, compliant with the Model Inventory of Roadway Elements (MIRE) schema. Using the web portal, the data parameters corresponding to an average intersection can be collected and catalogued into a database in approximately 10 minutes. The collection methodology and web portal function independently of the linear referencing method; therefore, the tool can be tailored and used by any state with spatial roadway data. Preliminary single-variable analysis was performed, showing that there are relationships between individual intersection characteristics and crash frequency. Future work will investigate multivariate analysis and develop safety performance functions and crash modification factors.
Consequences of Road Traffic Accident in Nigeria: Time Series Approach - Editor IJCATR
Road traffic accidents in Nigeria are increasing at a worrying rate and have become one of the country's major concerns. We provide appropriate and suitable time series models for the consequences of road accidents: the injured, killed and total casualties of road accidents in Nigeria. The most widely used conventional method, the Autoregressive Integrated Moving Average (ARIMA) model of time series, also known as the Box-Jenkins method, is applied to yearly data on road accident consequences in Nigeria from 1960-2013 to determine patterns in the injured, killed and total casualties along Nigerian motorways. Appropriate models are developed for each accident consequence. An ARIMA(0, 2, 1) model is obtained for the injured and total casualty series, whilst an ARIMA(1, 2, 2) model is obtained for the killed series, using data from 1960-2011. The adequacy and performance of the models are tested on the remaining data from 2012 to 2013. Seven-year forecasts produced with the developed models show that the examined road traffic accident consequences, injured, killed and total casualties, would continue to increase on average.
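The d = 2 in the ARIMA(0, 2, 1) model above means the casualty series is differenced twice before the moving-average term is fitted. A minimal sketch of that differencing step, on made-up yearly counts rather than the Nigerian data:

```python
def difference(series, order=1):
    """Apply first differencing `order` times (the 'I' in ARIMA with d = order)."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

# A series with a quadratic (accelerating) trend: 10, 14, 20, 28, 38, 50.
# Second differencing removes the trend entirely, leaving a constant series.
casualties = [t * t + 3 * t + 10 for t in range(6)]
second_diff = difference(casualties, order=2)
```

Differencing twice is what allows a low-order MA term to model what remains of a series with an accelerating trend, as the reported rising casualty counts imply.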
Black spots identification on rural roads based on extreme learning machine - IJECE IAES
Accident black spots are usually defined as road locations with a high risk of fatal accidents. A thorough analysis of these areas is essential to determine the real causes of mortality due to these accidents and can thus help anticipate the decisions needed to mitigate their effects. In this context, this study aims to develop a model for the identification, classification and analysis of black spots on roads in Morocco. These areas are first identified using the extreme learning machine (ELM) algorithm, and then the infrastructure factors are analyzed by ordinal regression. The XGBoost model is adopted for weighted severity index (WSI) generation, which in turn produces the severity scores assigned to individual road segments. The segments are then classified into four classes (high, medium, low and safe) using a categorization approach. Finally, a bagging extreme learning machine is used to classify the severity of road segments according to infrastructure and environmental factors. Simulation results show that the proposed framework accurately and efficiently identified the black spots and outperformed reputable competing models, reaching an accuracy of 98.6%. In conclusion, the ordinal analysis revealed that pavement width, road curve type, shoulder width and position were the significant factors contributing to accidents on rural roads.
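The categorization step described above maps each segment's weighted severity index (WSI) score to one of four classes. A minimal sketch with hypothetical thresholds and scores (the paper does not publish its cut-off values):

```python
def classify_segment(wsi, thresholds=(0.25, 0.5, 0.75)):
    """Map a weighted severity index in [0, 1] to one of four severity classes."""
    low, medium, high = thresholds  # hypothetical cut-offs, not the paper's
    if wsi < low:
        return "safe"
    if wsi < medium:
        return "low"
    if wsi < high:
        return "medium"
    return "high"

# Hypothetical WSI scores for five road segments
scores = [0.1, 0.3, 0.6, 0.8, 0.45]
classes = [classify_segment(s) for s in scores]
```

In the paper the scores themselves come from an XGBoost model; the sketch shows only the final thresholding into the four named classes.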
Accident Analysis at the Black Spot: A Case Study - iosrjce
Humans prefer comfort in every form. This has prompted them to lay roads and invent
motor vehicles, and today we see a very large number of vehicles on the roads. With this
comfort, however, came the problem of accidents due to the increase in traffic volume. The increased human misery and
serious economic loss caused by road accidents demand the attention of society and call for a solution to this
problem. The causes of accidents are many: the fault of the driver, vehicular defects, tough
weather conditions, improper road design and more. If accidents occur frequently at a
particular road stretch, that location is termed a Black Spot. In the present work, an attempt has been made to
evaluate the effects of highway geometrics and speed parameters in increased accident rates at the black spot. The
black spot of our interest is Busthenahalli bypass (spot-A) on National Highway-48 between Bangalore and
Mangalore, Karnataka, India. The mixed traffic condition prevailing on the road and the inadequate geometric
conditions on field create the problem of increased accident rates. The regression equation for the condition
prevailing has been found for the location under consideration which represents the variation of accident rate with
age of the driver, rise and fall, pavement width, Stopping Sight Distance for operating and regulating speeds, and
Annual Daily Traffic (ADT).
The Effects of Vehicle Speeds on Accident Frequency within Settlements along ... - IJMER
Literature provides overwhelming evidence that a strong relationship exists between
vehicle speed and accident risk, and with outcome severity in the event of an accident. Excessive speed
is said to be a major causal factor of road accidents on trunk roads, contributing 60% of all vehicular
accidents. However, speed rationalization measures implemented on a number of trunk roads in
Ghana have realized very little success. This study therefore investigated the effects of vehicle speeds
on accident frequency within settlements along trunk roads. Data was collected on accidents, vehicle
speeds and other road and environment-related features for ninety-nine (99) settlements delineated
from four (4) trunk roads. Correlation analysis was employed to establish useful relationships and
provided insight into the contributions of relevant road and environmental-related variables to the
occurrence of road traffic accidents. Using the Negative Binomial error structure within the
Generalized Linear Model framework, core (flow-based) models were formulated based on accident
data and exposure variables (vehicle mileage, daily pedestrian flow and travel speed). Incremental
addition of relevant explanatory variables further expanded the core models into comprehensive
models. Findings indicate the main risk factors are number of accesses, daily pedestrian flow and
total vehicle kilometers driven, as vehicle speed did not appear to influence the occurrence of road
traffic accidents within settlements along trunk roads. In settlement corridors, mitigating accident
risks should not focus only on traffic calming but rather on measures that reduce pedestrian and
vehicular conflict situations as well as improve conspicuity around junctions.
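The Negative Binomial error structure used above is preferred over Poisson when accident counts are overdispersed, i.e. when the sample variance exceeds the mean. A minimal sketch of that diagnostic check (the settlement accident counts below are invented):

```python
def overdispersion_ratio(counts):
    """Variance-to-mean ratio; values well above 1 favor a Negative Binomial model."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)  # sample variance
    return variance / mean

# Hypothetical yearly accident counts for ten settlement corridors:
# a few high-count corridors inflate the variance well beyond the mean
counts = [0, 1, 1, 2, 3, 3, 5, 8, 12, 15]
ratio = overdispersion_ratio(counts)
```

A Poisson model assumes this ratio is 1; a ratio well above 1, as here, is the usual justification for the Negative Binomial family within the GLM framework.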
Predictive analysis of traffic violations - Darshak Mehta
The paper produced results based on traffic violation data updated daily in Montgomery County, USA. Using this data set, we analyzed the effect of traffic violations on traffic accidents using various big data analysis techniques. In particular, three modeling hypotheses were developed based on initial data understanding and the five visualizations performed. The required data preprocessing was carried out to address these hypotheses using three models with different algorithms.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Analysis of Roadway Fatal Accidents using Ensemble-based Meta-Classifiers
International Journal of Artificial Intelligence and Applications (IJAIA), Vol.11, No.4, July 2020
DOI: 10.5121/ijaia.2020.11408
ANALYSIS OF ROADWAY FATAL ACCIDENTS USING
ENSEMBLE-BASED META-CLASSIFIERS
Waheeda Almayyan
Computer Information Department, College of Business Studies, PAAET, Kuwait
ABSTRACT
In the past decades, a lot of effort has been put into roadway traffic safety. With the help of data mining, the analysis of roadway traffic data is much needed to understand the factors related to fatal accidents. This paper analyses the Fatality Analysis Reporting System (FARS) dataset using several data mining algorithms. Here, we compare the performance of four meta-classifiers and four data-oriented techniques known for their ability to handle imbalanced datasets, all based on the Random Forest classifier. We also study the effect of applying several feature selection algorithms, including PSO, Cuckoo, Bat and Tabu search, on improving the accuracy and efficiency of classification. The empirical results show that the Threshold Selector meta-classifier combined with over-sampling techniques gave very satisfactory results. In this regard, the proposed technique achieved a mean overall accuracy of 91% and a balanced accuracy that varies between 96% and 99% using 7-15 features instead of the 50 original features.
KEYWORDS
Data mining; roadway traffic safety; Imbalanced Data; Meta-classifiers; Ensemble classifier; Data Sampling.
1. INTRODUCTION
Nowadays, traffic road accidents are considered a major public health problem all around the world. Figures from the World Health Organization on global road accident fatalities show approximately 1.35 million deaths annually worldwide [1]. Road traffic injuries are now the leading killer of people aged 5-29 years. The cost of traffic accidents is estimated at 3% of Gross Domestic Product worldwide, and more than 90% of road traffic deaths occur in low- and middle-income countries. Analysing the severity of accidents is the key to improving road safety [2]. Recent roadway safety studies have focused on identifying the contributing factors that impact accident severity. Nevertheless, several risk causes are still waiting to be discovered or analysed [3].
A number of publications have examined the use of data mining methods in many aspects [4]. Data mining can be defined as a process that uses a variety of data analysis tools to discover patterns and relationships in data that may be used to make valid predictions. Data mining tools are based on highly automated search procedures. Basically, data mining overcomes the weaknesses of traditional techniques that operate under the assumption that data are distributed normally or according to another distribution, an assumption that can be incorrect and may be difficult to validate. Data mining methods rely intensively on computing power. Among their strengths, they are able to handle categorical variables with a large number of categories, as well as incomplete and noisy datasets [4].
Considerably fewer researchers have considered feature selection compared to the growth achieved in resampling techniques [5]. Under imbalanced scenarios, minority class samples can easily be discarded as noise. However, such risk can be reduced if the irrelevant features in the feature space are removed.
The literature review indicates a great interest in adapting data mining algorithms to the analysis of road accident data [8-18]. Our main task is to develop a machine learning-based intelligent model that can classify the severity of injuries in the FARS dataset more accurately. The Fatal Accidents Dataset contains all serious accidents that happened on public roads in 2007 and were reported to the National Highway Transportation Safety Administration (NHTSA) [6]. The dataset can be downloaded from California Polytechnic State University, and all data originally came from FARS. The dataset contains 37,248 records and 55 attributes. The data description can be found in the FARS Analytic Reference Guide covering the years 1975-2007 [7]. The objective of this study is twofold: first, to propose an algorithm for extracting factors that are significantly related to car accidents according to their injury severity; second, to investigate the impact of data-oriented/re-sampling techniques on enhancing classification performance on the imbalanced dataset.
The remaining parts of the paper are organized as follows. Section 2 summarizes recent related work that addresses the problem of roadway traffic safety, followed by Section 3, which covers the methodology, the applied meta-classifiers and the applied feature selection techniques. The evaluation environment and performance metrics, along with the evaluation of the different models and experimental results, are discussed in Section 4. Finally, conclusions are presented in Section 5.
2. RELATED WORKS
In 1987, individual states in the USA were allowed to raise speed limits on rural freeways from 55 to 65 mph. Ossiander and Cummings analysed the effect of the increased speed limits through an ecological study of crashes and vehicle speeds on Washington State freeways from 1974 through 1994 [8]. They concluded that the incidence of fatal crashes more than doubled after 1987 and that the death rate went up by 27%, compared with a 10% increase in the states that did not increase the speed limit. Solaiman et al. developed an Internet-based prototype GIS and Road Accident View System for automated road accident analysis and visualization [9]. The suggested system had the capability to perform queries over accident information, trend analysis, statistical analysis, colour-coded mapping and other accident information display within a web-based environment. In addition, it could predict possible dangerous accident sites from the data gathered using the system's map API.
Researchers in [10] used partition-based and density-based clustering to analyse road accident data. They first clustered the accident data using the K-modes algorithm, and then association-rule mining was applied to identify, for each cluster, the correlation among the various sets of attributes under which an accident may occur. Chang et al. [11] applied the classification and regression tree model (CART) to analyse Taiwan traffic accident data from 2001, studying the relationship between fatal injuries and driver/vehicle, highway/environment and accident variables. The results indicated that the most important variable related to crash severity is the vehicle type: pedestrians, motorcycle riders and bicycle riders have higher risks of being injured than other types of vehicle drivers in traffic accidents. Krishnaveni and Hemalatha applied several classification models to predict the injury severities in traffic accidents that occurred in Hong Kong during 2008 [12]. Naive Bayes, AdaBoostM1, PART rule, J48 and Random Forest classifiers were selected for classifying the type of injury severity, while a Genetic Algorithm was applied to reduce the dimensionality of the dataset. The final results showed that Random Forest outperformed the other four algorithms.
Kwon et al. applied several classification algorithms for ranking the main factors that cause accidents [13]. With a binary logistic regression model used as the basis for comparison, they applied the decision tree and Naive Bayes classifiers. Bahiru et al. built decision trees based on the J48, ID3 and CART algorithms, along with a Naïve Bayes classifier, to model the severity of injury [14]. The experimental results showed that the accuracy of the J48 classifier is higher than that of the other models. The authors believed that closely studying road traffic accident data together with the accident severity and time components can help to identify hidden temporal patterns. Silva and Saraee proposed a novel approach combining the Decision Tree algorithm and the Time-Series Calendar Heatmaps technique to extract knowledge with time factors from accident datasets from a region in the North of England [15]. Based on the decision tree models and evaluation measures, they noticed a correlation between the hour and month of the accident and its severity.
Researchers in [16] noticed that traffic accident datasets are usually imbalanced. Therefore, they investigated under-sampling, over-sampling and a mixed technique that combines both. Different Bayes classifiers were used to analyse imbalanced and balanced traffic crash datasets covering three years in Jordan. The results indicated that the most influential parameters were the number of vehicles involved, accident pattern, number of directions, accident type, lighting, surface condition, and speed limit. Moreover, they noticed that balancing the datasets with an over-sampling technique improved the classification results of the Bayesian networks. Li et al. [17] applied the Apriori algorithm, the Naive Bayes classifier and k-means clustering to the FARS Fatal Accident dataset to study the relationship between injury severity and other attributes such as collision manner, light condition, drunk driving and weather conditions. The analysis suggested that human factors, such as drunk driving, and the collision type have a stronger effect on the fatality rate than environmental factors like roadway surface, weather, and light conditions. Pakgohar et al. [18] explored the role of human factors in the incidence and severity of road crashes in Iran by employing descriptive analysis, Logistic Regression, and Classification and Regression Trees. The study indicated human responsibility in the occurrence of fatal accidents, and its results underline the important role of 'Driving License' and 'Safety Belt' safety policies, which might lower the severity of injuries in traffic accidents in Iran.
3. METHODOLOGY
First of all, as a pre-processing step we calculated the injury_severity variable according to the number of casualties and the number of persons involved in the accident, and binned it into two categories, high and low severity. This variable is adapted from [17]; it represents the percentage of severity in an accident and is calculated as

Injury_severity = FATALS / PERSONS (1)

where FATALS is the number of fatalities and PERSONS is the number of persons involved in the accident. Injury_severity is referred to as "class" in the analysis. According to this, the FARS dataset contains 25,545 instances of minor car accidents and 11,703 instances of serious accidents, so the current imbalance ratio is roughly 2:1 for the majority and minority classes, respectively. Therefore, we decided to tackle this issue with further investigation.
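As a sketch of this pre-processing step, the ratio in Eq. (1) and its binning into two severity classes can be written as follows. The 0.5 cut-off used here is an illustrative assumption only; the paper does not state the exact bin boundary:

```python
def injury_severity(fatals, persons):
    """Percentage of severity in an accident, Eq. (1): FATALS / PERSONS."""
    return fatals / persons

def severity_class(fatals, persons, cutoff=0.5):
    """Bin the ratio into 'high' or 'low' severity.

    The 0.5 cut-off is a hypothetical choice for illustration;
    the paper does not specify the boundary between the two bins."""
    return "high" if injury_severity(fatals, persons) >= cutoff else "low"
```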
Several studies give equal importance to all classes, assuming that datasets are balanced, and this often results in poor classification results [19-23]. The imbalance problem happens when the number of instances of one class outnumbers the others; this situation is common in real datasets and usually needs to be handled [19,20]. Schierz et al. [21] compared four Cost-sensitive classifiers, namely Naïve Bayes, SVM, Random Forest and the C4.5 decision tree, to classify pharmaceutical data; they noticed that the SVM and C4.5 decision tree classifiers performed relatively well. Other studies discussed adapting data-oriented/resampling techniques as a solution for the imbalance problem. Data-level resampling is one of the many ways to handle the class imbalance problem. In a recent study, Singh proposed a data-level resampling method to improve learning from class-imbalanced datasets in health applications [23]. The learning process on different datasets was improved by incorporating the distribution structure of minority class samples to generate new data samples using deep learning neural networks.
The primary objective of this research is to enhance FARS dataset classification while finding the minimum number of features that are significantly related to the car accidents. The primary focus is on investigating the impact of data-oriented/re-sampling techniques on enhancing the classification performance over the imbalanced dataset. In this study we evaluate the performance of four meta-classifiers and four data-oriented techniques for handling the imbalance ratio.
3.1. Meta Classifiers
Numerous classification solutions have been suggested in the literature to handle imbalanced datasets, and several studies have investigated the performance of classifiers to help in choosing the appropriate classification method. Ensemble learning algorithms, which utilize ensembles of classifiers such as neural networks, Random Forest, bagging and boosting, have received increasing interest for their ability to deliver accurate predictions while being more robust to noise and outliers than single classifiers [19,24]. The basic idea behind ensemble classifiers is the premise that a group of classifiers can perform better than an individual classifier. In this research, we compared the performance of four different meta-classifiers known for their ability to handle imbalanced datasets, namely the Bagging, Cost-sensitive, MetaCost and Threshold Selection classifiers, with Random Forest as the base classifier. Random Forest consists of a combination of individual base classifiers where each tree is generated using a random vector sampled independently from the classification input vector, enabling much faster construction of trees. In 2001, Breiman proposed this promising tree-based ensemble classifier, built from a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees, and named it Random Forest [25]. Random Forest is regarded as one of the best learning methods, as it is more robust and can achieve better performance than single decision trees.
3.1.1. Bagging Classifier
Leo Breiman proposed Bagging, or Bootstrap aggregating, as a meta-algorithm based on an ensemble of decision trees to improve classification and regression models in terms of stability and classification accuracy [25]. Additionally, it decreases variance and reduces overfitting. The technique can be used with any type of model, such as neural networks, although it is most often applied to decision tree models [26].

In Bagging, diversity is obtained by using bootstrapped replicas of the original training set, where different training datasets are randomly drawn with replacement. With each training data replica a decision tree is built using the standard approach, so each tree can be defined by a different set of variables, nodes and leaves. Finally, their predictions are combined to obtain the final result: by averaging the outputs in regression and by majority voting in classification [27].
5. International Journal of Artificial Intelligence and Applications (IJAIA), Vol.11, No.4, July 2020
105
3.1.2. Cost-Sensitive Classifier
Cost-sensitive classification is a type of learning in data mining that takes the penalty of misclassification into consideration [28]. In the Cost-sensitive learning process, the objective is to develop a hypothesis that minimizes the high-cost errors and the total misclassification cost. Therefore, a Cost-sensitive classification technique takes the cost matrix into consideration during model building and generates the model with the minimum expected cost [29].

For a two-class problem, let C(Min,Maj) denote the cost of misclassifying a majority class instance as a minority instance and C(Maj,Min) the cost of misclassifying a minority class instance as a majority instance. When dealing with the class imbalance problem, the cost of misclassifying minority examples is higher than the cost of misclassifying majority examples (C(Maj,Min) > C(Min,Maj)), and there is no penalty for correct classification (i.e., C(Maj,Maj) = C(Min,Min) = 0).
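The minimum-expected-cost decision described here can be made concrete with a small sketch. The cost values below are illustrative, not the ones used in the paper's experiments:

```python
def min_expected_cost_class(prob_min, costs):
    """Return the class label with the minimum expected misclassification cost.

    prob_min is the classifier's probability for the minority class;
    costs[(true, predicted)] is the penalty matrix C described in the text."""
    probs = {"Min": prob_min, "Maj": 1.0 - prob_min}
    def expected_cost(pred):
        return sum(probs[true] * costs[(true, pred)] for true in ("Maj", "Min"))
    return min(("Maj", "Min"), key=expected_cost)

# Illustrative cost matrix: misclassifying a minority instance is 5x worse,
# and correct classifications carry no penalty, as stated in the text.
COSTS = {("Maj", "Maj"): 0, ("Min", "Min"): 0,
         ("Maj", "Min"): 1, ("Min", "Maj"): 5}
```

With these costs, an instance whose minority-class probability is only 0.2 is still labelled "Min", because the expected cost of predicting "Maj" (0.2 x 5 = 1.0) exceeds that of predicting "Min" (0.8 x 1 = 0.8); this is exactly how cost-sensitivity shifts the decision boundary toward the minority class.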
3.1.3. Threshold-Based Selector Classifier
Threshold Selector is a meta-classifier that sets a threshold on the probability output of a base classifier. Threshold adjustment of the classifier's decision is one of the methods used for dealing with imbalanced datasets [30]. The meta-classifier selects a mid-point threshold on the probability output by the classifier, set so that a given performance measure is optimized [31]. By default, the probability threshold is 0.5, i.e. if an instance is attributed a probability of 0.5 or less, it is classified as negative for the respective class, while if the probability is greater than 0.5, the instance is classified as positive.

Performance is measured either on the training data, on a hold-out set or using cross-validation. In addition, the probabilities returned by the base learner can have their range expanded so that the output probabilities reside between 0 and 1, which is useful if the scheme normally produces probabilities in a very narrow range. For our experiments, the optimal threshold was selected automatically by the meta-classifier, applying internal five-fold cross-validation to optimize the threshold according to the f-measure (Eq. 13) as the measure of a model's accuracy [32].
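A minimal sketch of the threshold-selection idea: scan candidate thresholds over held-out probability estimates and keep the one that maximises the f-measure. (The actual experiments use a meta-classifier with internal five-fold cross-validation; this standalone loop only illustrates the optimization step.)

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (cf. the paper's Eq. 13)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def best_threshold(scored):
    """Pick the probability threshold maximising f-measure.

    `scored` is a list of (probability, true_label) pairs with labels 0/1."""
    best_t, best_f = 0.5, -1.0
    for t in sorted({p for p, _ in scored}):
        tp = sum(1 for p, y in scored if p >= t and y == 1)
        fp = sum(1 for p, y in scored if p >= t and y == 0)
        fn = sum(1 for p, y in scored if p < t and y == 1)
        f = f_measure(tp, fp, fn)
        if f > best_f:
            best_t, best_f = t, f
    return best_t
```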
3.1.4. MetaCost Classifier
The MetaCost procedure relabels the classes of the training examples and then employs the modified training set to produce the final model. MetaCost depends on an internal Cost-sensitive classifier to relabel the classes of the training examples. Nevertheless, the study by Domingos made no comparison between MetaCost's final model and the internal Cost-sensitive classifier on which MetaCost depends [33]. This comparison is worth making, as it is plausible that the internal Cost-sensitive classifier may outperform the final model without the additional computation required to derive the final model. Boosting [34,35] can be effective and can outperform bagging [36] in minimizing errors, and since MetaCost uses bagging internally, using a boosting procedure in MetaCost may improve its performance. This is why we chose to use boosting procedures in MetaCost in this paper. This meta-classifier makes its base classifier Cost-sensitive using the method specified in [33]; the implementation uses all bagging iterations when reclassifying the training data.
3.2. Feature Selection Algorithms
Feature selection is a challenging machine learning task that aims at reducing the number of features by removing irrelevant, redundant and noisy data while maintaining an acceptable level of classification accuracy. Essentially, the feature space is explored in order to reduce it and prepare the conditions for the classification step. The performance of the selected dimension-reduction techniques is examined to find the most effective one.
3.2.1. Tabu Search
Tabu Search is a memory-based metaheuristic algorithm proposed by Glover in 1986 to solve combinatorial optimization problems [37,38]. Since then, Tabu Search has been successfully applied to feature selection problems [39]. Tabu Search is a local neighbourhood search algorithm that simulates the optimal characteristics of human memory functions; it combines a local search with a tabu mechanism.

It starts with an initial feasible solution X' from the set of feasible solutions, and at each iteration the algorithm searches the neighbourhood N(X) of the best solution to obtain a new one with an improved objective value. A solution X' ∈ N(X) can be reached from X in two cases: X' is not included in the tabu list, or X' is included in the tabu list but satisfies the aspiration criterion [40]. If the new solution X' is superior to X_best, the value of X_best is overridden. To avoid cycling, solutions that were previously visited are declared forbidden, or tabu, for a certain number of iterations, which improves the performance of the local search. Then the neighbourhood search is resumed based on the new feasible solution X'. This procedure is executed iteratively until the stopping criteria are met. After the iterative process has terminated, the current best solution X_best is considered the final optimal solution provided by the Tabu Search method [41].
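The loop above can be sketched for feature selection as follows. The subset score below is a toy stand-in for the cross-validated classifier accuracy the paper would use, and all parameter values (iterations, tabu tenure) are illustrative:

```python
def tabu_search(n_features, score, iters=30, tenure=5):
    """Tabu search over feature subsets.

    `score` evaluates a frozenset of feature indices (higher is better);
    neighbours are formed by flipping (adding or removing) one feature."""
    current = frozenset(range(n_features))          # start from the full set
    best, best_score = current, score(current)
    tabu = []                                       # recently visited solutions
    for _ in range(iters):
        neighbours = [current ^ {f} for f in range(n_features)]
        # keep non-tabu moves, plus tabu moves that beat the best (aspiration)
        allowed = [s for s in neighbours
                   if s not in tabu or score(s) > best_score]
        if not allowed:
            break
        current = max(allowed, key=score)
        tabu.append(current)
        if len(tabu) > tenure:                      # forget old tabu entries
            tabu.pop(0)
        if score(current) > best_score:
            best, best_score = current, score(current)
    return best

# Toy objective: the ideal subset is exactly {0, 2}.
target = frozenset({0, 2})
selected = tabu_search(5, lambda s: -len(s ^ target))
```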
3.2.2. Particle Swarm Optimization Search
The particle swarm optimization (PSO) algorithm is a population-based stochastic optimization technique introduced in 1995 by Kennedy and Eberhart [42]. In PSO, a candidate solution is encoded as a finite-length string called a particle p_i in the search space. All of the particles make use of their own memory and of the knowledge gained by the swarm as a whole to find the best solution. In order to discover the optimal solution, each particle adjusts its searching direction according to two factors: its own best previous experience (pbest) and the best experience of its companions (gbest).

Each particle moves around the n-dimensional search space S with objective function f : S → ℝ. Each particle has a position x_{i,t}, a fitness value f(x_{i,t}), and "flies" through the problem space with a velocity v_{i,t}. A new position z1 ∈ S is called better than z2 ∈ S iff f(z1) < f(z2). Particles evolve simultaneously based on knowledge shared with neighbouring particles. The best search-space position particle i has visited up to iteration t is its previous experience pbest; to each particle, a subset of all particles is assigned as its neighbourhood, and the best previous experience of all neighbours of particle i is called gbest. Each particle additionally keeps a fraction of its old velocity. The particle updates its velocity and position with the following equations in continuous PSO [43]:

v_{i,t+1} = w·v_{i,t} + c1·r1·(pbest_i − x_{i,t}) + c2·r2·(gbest − x_{i,t}) (2)

x_{i,t+1} = x_{i,t} + v_{i,t+1} (3)

where w is the inertia weight, c1 and c2 are acceleration constants, and r1, r2 are random numbers drawn uniformly from [0,1].
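The velocity and position updates of Eqs. (2)-(3) translate directly into code. The inertia weight and acceleration constants below are common textbook values, assumed here for illustration:

```python
import random

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One velocity/position update per Eqs. (2)-(3).

    w, c1, c2 are assumed typical values; r1, r2 are fresh uniform draws
    per dimension, as in continuous PSO."""
    rng = rng or random.Random(1)
    new_v = [w * vi
             + c1 * rng.random() * (pb - xi)
             + c2 * rng.random() * (gb - xi)
             for xi, vi, pb, gb in zip(x, v, pbest, gbest)]
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v
```

When a particle already sits on both its personal and global best (x = pbest = gbest), the attraction terms vanish and only the inertia term w·v remains, so the velocity simply decays.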
3.2.3. Cuckoo Search
Cuckoo search is a population-based metaheuristic algorithm developed by Xin-She Yang and Suash Deb in 2009 [44]. Since then it has been used as a successful adaptive search strategy for solving optimization problems. Recent studies showed that the Cuckoo search algorithm is computationally efficient and easy to implement, with few parameters [45].

The basic idea behind the Cuckoo search algorithm is derived from the brood parasitism of some cuckoo species. These species lay their eggs in the nests of host birds; the eggs mimic the pattern and colour of the native eggs to reduce the probability of being discovered, and the cuckoos rely on the host birds to incubate them. Sometimes the host birds discover the alien eggs and throw them away, or simply abandon their nests and build new ones elsewhere. In the Cuckoo search algorithm, each egg in a nest represents a solution, and a cuckoo egg represents a new solution. The goal is to use the new and potentially better solutions (cuckoos) to replace the not-so-good solutions in the nests [45]. Hence, the Cuckoo search algorithm explores the search space efficiently and is less likely to fall into a local optimum.
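A minimal sketch of the scheme just described: each nest holds a candidate solution, a new "cuckoo egg" replaces a worse nest, and a fraction pa of the worst nests is abandoned and rebuilt. For brevity this sketch uses plain Gaussian steps instead of the Levy flights of the original algorithm, and the bounds and parameters are illustrative:

```python
import random

def cuckoo_search(score, dim, n_nests=15, pa=0.25, iters=200, rng=None):
    """Maximise `score` over dim-dimensional vectors, simplified cuckoo search."""
    rng = rng or random.Random(42)
    new_nest = lambda: [rng.uniform(-5.0, 5.0) for _ in range(dim)]
    nests = [new_nest() for _ in range(n_nests)]
    for _ in range(iters):
        # generate a cuckoo egg by a random step from a random nest
        i = rng.randrange(n_nests)
        cuckoo = [xi + rng.gauss(0.0, 1.0) for xi in nests[i]]
        # drop it into a random nest if it is better
        j = rng.randrange(n_nests)
        if score(cuckoo) > score(nests[j]):
            nests[j] = cuckoo
        # abandon the worst pa fraction of nests and rebuild them
        nests.sort(key=score, reverse=True)
        for k in range(n_nests - int(pa * n_nests), n_nests):
            nests[k] = new_nest()
    return max(nests, key=score)
```

The periodic abandonment step is what keeps fresh random solutions entering the population, which is the mechanism behind the claim that the search is unlikely to stagnate in a local optimum.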
3.2.4. Bat Search
The Bat Algorithm is a metaheuristic Swarm Intelligence algorithm proposed in 2010 by Yang [46], who was inspired by the abilities of bats to search for prey and to discriminate between different types of insects and obstacles even in complete darkness. Bats emit loud sound pulses that help them detect targets and avoid obstacles. In order to transpose this behaviour into an intelligent algorithm, the author states three hypotheses. First, all bats use echolocation to identify their prey. Second, all bats fly randomly, and their trajectory is characterized by an internally encoded frequency (freq), velocity (v) and position in space (x). At each iteration of the algorithm these three variables are updated as:
freq_i = freq_min + (freq_max − freq_min) · β (4)
v_i^t = v_i^{t−1} + (x_i^{t−1} − x_best) · freq_i (5)
x_i^t = x_i^{t−1} + v_i^t (6)
where β ∈ [0, 1] is a random vector drawn from a uniform distribution and x_best is the current
global best position. Moreover, as a bat closes in on its target, it decreases its loudness (A_i)
and increases its pulse emission rate (r_i) as follows:
A_i^{t+1} = α · A_i^t (7)
r_i^{t+1} = r_i^0 · [1 − e^{−γ·t}] (8)
where α (0 < α < 1) and γ (γ > 0) are constants. Finally, as the third assumption, the author
assumes that the loudness varies from a large initial value down to a minimum one.
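A compact sketch of these update rules, applied to a toy minimization problem, might look as follows. The population size, frequency range, and constants α and γ are illustrative choices, not values from this study:

```python
import math
import random

random.seed(1)

def sphere(x):
    """Toy objective to minimize: sum of squares."""
    return sum(v * v for v in x)

def bat_algorithm(fitness, dim=2, n_bats=20, iters=150,
                  f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9):
    x = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_bats)]
    v = [[0.0] * dim for _ in range(n_bats)]
    loudness = [1.0] * n_bats        # A_i
    r0 = 0.5                         # initial pulse rate r_i^0
    pulse = [0.0] * n_bats           # r_i
    best = min(x, key=fitness)
    for t in range(1, iters + 1):
        for i in range(n_bats):
            beta = random.random()
            freq = f_min + (f_max - f_min) * beta                  # Eq. (4)
            v[i] = [v[i][d] + (x[i][d] - best[d]) * freq           # Eq. (5)
                    for d in range(dim)]
            cand = [x[i][d] + v[i][d] for d in range(dim)]         # Eq. (6)
            # Accept improving moves with probability given by the loudness.
            if random.random() < loudness[i] and fitness(cand) < fitness(x[i]):
                x[i] = cand
                loudness[i] = alpha * loudness[i]                  # Eq. (7)
                pulse[i] = r0 * (1 - math.exp(-gamma * t))         # Eq. (8)
                # (In the full algorithm, pulse[i] gates an additional
                # local search around the current best position.)
        best = min(x + [best], key=fitness)
    return best, fitness(best)

best_x, best_f = bat_algorithm(sphere)
```

Note how loudness and pulse rate move in opposite directions on each accepted improvement, exactly as Equations (7) and (8) prescribe.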
International Journal of Artificial Intelligence and Applications (IJAIA), Vol.11, No.4, July 2020
4. EXPERIMENTS AND RESULTS
To estimate the generalization error of our method, we applied 10-fold cross-validation to
avoid overfitting the learned models. For highly imbalanced datasets, accuracy alone can be
misleading. Therefore, we considered performance measures more appropriate for comparing
classifiers on imbalanced data, such as balanced accuracy, which takes both sensitivity and
specificity into account. Hence, if two models deliver the same sensitivity value, the model that
demonstrates the higher balanced accuracy is prioritized for selection. Specificity, Sensitivity,
Accuracy, Balanced Accuracy, Precision, 𝑓-measure and MCC are the quality metrics used to
assess the learning performance of each classifier before and after data resampling [47]. Their
formulae are shown in Equations 9-15, where TP denotes true positives; TN, true negatives; FP,
false positives; and FN, false negatives.
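Since Equations 9-15 are not reproduced in this excerpt, the sketch below computes the seven metrics from the confusion-matrix counts using their conventional definitions (in particular, balanced accuracy is taken as the mean of sensitivity and specificity):

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Binary-classification quality metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                      # true positive rate
    specificity = tn / (tn + fp)                      # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    balanced_accuracy = (sensitivity + specificity) / 2
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "precision": precision,
        "f_measure": f_measure,
        "mcc": mcc,
    }

m = classification_metrics(tp=90, tn=80, fp=20, fn=10)
```

For example, with 90 true positives, 80 true negatives, 20 false positives and 10 false negatives, sensitivity is 0.9, specificity 0.8, and balanced accuracy 0.85.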
One of the interesting aspects of data mining is building computational models able to extract
hidden knowledge from data. Regarding model settings, not all classifiers require parameter
optimization. For example, ClassBalancer automatically reassigns weights to the instances in the
dataset so that each class has the same total weight [48,49]; therefore, it requires no
adjustment. For Bagging, the only parameter to optimize is the number of bags, which in our case
was set to 100. For the Cost-sensitive Classifier and MetaCost, the misclassification cost was
initially set according to the imbalance ratio; if this did not yield a sensitivity of at least
0.5, the cost was further increased to arrive at the final model. All the meta-learning methods
evaluated in this study were implemented in the WEKA software suite, a Java-based open-source
data-mining tool [48]. The number of trees for Random Forest was set to 100, since it has been
shown that the optimal number of trees is usually between 64 and 128, and increasing the number
of trees does not necessarily improve a model's performance [27,49].
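The 10-fold cross-validation procedure used throughout can be sketched as a generic index-splitting helper (this is not WEKA's implementation; fold assignment here is a simple shuffled deal, without stratification):

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle sample indices and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validation_splits(n, k=10):
    """Yield (train, test) index lists, one pair per fold."""
    folds = k_fold_indices(n, k)
    for i in range(k):
        test = folds[i]
        # Train on the union of the other k-1 folds;
        # a classifier would be fitted and evaluated here.
        train = [j for f in range(k) if f != i for j in folds[f]]
        yield train, test

splits = list(cross_validation_splits(105, k=10))
```

Each sample appears in exactly one test fold, so the k evaluations together cover the whole dataset once.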
To determine which data-oriented technique is the most suitable for our FARS dataset, and before
running feature selection, we first performed experiments on the class-imbalanced dataset. The
performance values of the meta-classifiers and of the suggested sampled datasets are shown in
Table 1 to facilitate comparison between the investigated methods. Observing the values of
Specificity, Balanced Accuracy and Precision, it is clear that the Threshold selector algorithm
achieved the highest values, at 94%, 97% and 97% respectively. In contrast, the values of
Sensitivity, Accuracy, 𝑓-measure and MCC show that Bagging, Cost-sensitive and MetaCost
outperformed the Threshold selector. We noticed that the Bagging, Cost-sensitive and MetaCost
classifiers recorded comparable performance in most cases. In general, all meta-classifiers
improved the Balanced Accuracy, with average Accuracies around 90%.
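The Threshold selector meta-classifier works by tuning the probability cut-off of the base classifier to optimize a chosen metric. A minimal sketch of that idea, here maximizing balanced accuracy over the observed scores of a toy example (the probabilities and labels are invented for illustration), could be:

```python
def best_threshold(probs, labels):
    """Scan candidate cut-offs on the positive-class probabilities and keep
    the one that maximizes balanced accuracy."""
    best_t, best_ba = 0.5, -1.0
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        tn = sum(1 for p, y in zip(probs, labels) if p < t and y == 0)
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        ba = (sens + spec) / 2
        if ba > best_ba:
            best_t, best_ba = t, ba
    return best_t, best_ba

# Toy example: the scores separate the two classes perfectly at 0.6.
probs = [0.9, 0.8, 0.75, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
t, ba = best_threshold(probs, labels)
```

On imbalanced data this shifted cut-off is what lets a classifier trade a little sensitivity for a large gain in specificity, as seen in the Threshold selector rows of Table 1.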
In our experiments, three over-sampling techniques, namely Synthetic Minority Oversampling
(SMOTE), Oversampling minority and Class Balancer, were applied together with an Under-sampling
method. The SMOTE technique resamples the dataset by synthesizing new minority-class instances.
The Oversampling minority technique duplicates minority-class instances, rather than
under-sampling the majority class, so that both classes end up with the same number of instances.
The Under-sampling technique draws a random subsample of the dataset to under-sample the
majority class. The over-sampling techniques work by increasing the number of examples in the
minority class to balance the class distribution and improve the detection rate of the minority
class, whereas Class Balancer reweights the instances so that each class has the same total
weight. Regarding the data-oriented results, oversampling the minority class is clearly the best
algorithm on all measures: it achieved 93%, 99%, 96%, 99%, 99%, 96% and 89% in terms of
Sensitivity, Specificity, Accuracy, Balanced Accuracy, Precision, 𝑓-measure and MCC. We can thus
see that the data-oriented techniques helped improve prediction performance.
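The two sampling strategies can be sketched in a few lines of Python. This shows random duplication and random subsampling only; a faithful SMOTE would additionally synthesize new points by interpolating between minority-class neighbours:

```python
import random

def _group_by_class(X, y):
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    return by_class

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows until every class matches the largest
    class size (the 'oversampling minority' strategy)."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            Xb.append(row)
            yb.append(label)
    return Xb, yb

def random_undersample(X, y, seed=0):
    """Randomly subsample every class down to the smallest class size."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for row in rng.sample(rows, target):
            Xb.append(row)
            yb.append(label)
    return Xb, yb
```

For an 8-to-2 class split, oversampling yields 8 instances per class while undersampling yields 2 per class; the choice trades extra duplicated rows against discarded majority-class information.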
Table 1. Classification results of FARS Dataset

Model | Sensitivity | Specificity | Accuracy | Balanced Accuracy | Precision | 𝑓-measure | MCC
Bagging | 0.891 | 0.829 | 0.872 | 0.914 | 0.919 | 0.905 | 0.698
Cost-sensitive | 0.886 | 0.828 | 0.868 | 0.914 | 0.918 | 0.902 | 0.689
MetaCost | 0.882 | 0.848 | 0.871 | 0.924 | 0.926 | 0.904 | 0.694
Threshold selector | 0.809 | 0.943 | 0.851 | 0.971 | 0.968 | 0.882 | 0.645
SMOTE | 0.868 | 0.954 | 0.911 | 0.977 | 0.949 | 0.907 | 0.790
Oversampling minority | 0.928 | 0.988 | 0.958 | 0.994 | 0.987 | 0.957 | 0.890
Class Balancer | 0.852 | 0.920 | 0.886 | 0.960 | 0.914 | 0.882 | 0.747
Under-sampling | 0.894 | 0.842 | 0.878 | 0.921 | 0.925 | 0.910 | 0.711
Feature selection techniques aim to identify a set of crucial features that have maximum
relevance to the target classes and, at the same time, minimum redundancy with the other features
in the dataset. The next step is therefore to investigate the potential of feature selection; at
the end of this step a subset of features is chosen for the next round. The features selected by
the PSO, Cuckoo, Bat and Tabu techniques are listed in Table 2. It is noteworthy that the number
of features is remarkably reduced compared with the original dataset: in this phase we reduced
the FARS feature set from 50 to only 7-15 features.
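As an illustration of the relevance-versus-redundancy wrapper idea (not the stochastic PSO, Cuckoo, Bat or Tabu searches themselves), a greedy forward-selection sketch with a user-supplied subset score might look like this. The feature names and the relevance/redundancy numbers in the toy score are invented for illustration:

```python
def forward_select(features, score, k):
    """Greedy wrapper selection: repeatedly add the single feature that most
    improves the subset score, stopping at k features or when nothing helps."""
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        cand = max(remaining, key=lambda f: score(selected + [f]))
        if score(selected + [cand]) <= score(selected):
            break
        selected.append(cand)
        remaining.remove(cand)
    return selected

# Invented toy score: per-feature relevance, minus a redundancy penalty
# when the (hypothetically correlated) pair B and C are chosen together.
relevance = {"A": 3.0, "B": 2.0, "C": 2.0, "D": 0.1}
def toy_score(subset):
    s = sum(relevance[f] for f in subset)
    return s - 3.0 if "B" in subset and "C" in subset else s

selected = forward_select(["A", "B", "C", "D"], toy_score, k=3)
```

Once B is picked, the redundancy penalty makes C less attractive than the weakly relevant D, so the greedy pass returns A, B, D: maximum relevance with minimum redundancy, in miniature.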
Table 2. Feature selection results

Feature Selection Algorithm | No. of Features | Features
PSO | 15 | MINUTE, VE_TOTAL, PEDS, HARM_EV, MAN_COLL, REL_ROAD, SP_LIMIT, ALIGNMNT, C_M_ZONE, NOT_MIN, HOSP_HR, SCH_BUS, CF1, DRUNK_DR, VE_FORMS
Cuckoo | 13 | STATE, MINUTE, VE_TOTAL, PEDS, HARM_EV, MAN_COLL, REL_ROAD, ALIGNMNT, C_M_ZONE, HOSP_HR, SCH_BUS, DRUNK_DR, VE_FORMS
Bat | 14 | VE_TOTAL, PEDS, HARM_EV, MAN_COLL, REL_ROAD, ALIGNMNT, TRA_CONT, HOSP_HR, HOSP_MN, SCH_BUS, CF1, DRUNK_DR, VE_FORMS, WEATHER
Tabu | 7 | MINUTE, HARM_EV, REL_ROAD, HOSP_MN, SCH_BUS, DRUNK_DR, VE_FORMS
Table 3 shows the classification results for the features selected by PSO. It can be observed
that the classification Balanced Accuracy using the PSO-selected 15-feature set varies between
91% and 96%. In this step the Threshold selector achieved 83%, 93%, 96.5% and 96% in terms of
Sensitivity, Specificity, Balanced Accuracy and Precision, while the scores of the other
classification models were similar, with an average Accuracy of 86%, 89% for 𝑓-measure and 67%
for MCC. Observing the data-oriented techniques, under-sampling the majority class is clearly
the best algorithm on most measures: it achieved 92%, 90%, 94%, 93% and 77% in terms of
Sensitivity, Accuracy, Precision, 𝑓-measure and MCC. The Oversampling minority technique scored
the highest Specificity and Balanced Accuracy. We noticed that the overall performance was not
hurt by reducing the number of features from 50 to 15.
Table 3. Performance comparison of the different learning paradigms after applying PSO algorithm

Model | Sensitivity | Specificity | Accuracy | Balanced Accuracy | Precision | 𝑓-measure | MCC
Bagging | 0.880 | 0.835 | 0.866 | 0.917 | 0.921 | 0.900 | 0.683
Cost-sensitive | 0.874 | 0.849 | 0.866 | 0.924 | 0.926 | 0.899 | 0.682
MetaCost | 0.873 | 0.852 | 0.867 | 0.926 | 0.928 | 0.900 | 0.683
Threshold selector | 0.833 | 0.931 | 0.864 | 0.965 | 0.963 | 0.893 | 0.671
SMOTE | 0.859 | 0.936 | 0.897 | 0.967 | 0.930 | 0.893 | 0.766
Oversampling minority | 0.842 | 0.940 | 0.891 | 0.970 | 0.935 | 0.886 | 0.746
Class Balancer | 0.851 | 0.885 | 0.868 | 0.942 | 0.881 | 0.866 | 0.724
Under-sampling | 0.920 | 0.869 | 0.904 | 0.935 | 0.939 | 0.929 | 0.772
Table 4 presents the classification results for the features selected by the Cuckoo search
technique. It can be observed that the classification Balanced Accuracy using the Cuckoo-selected
13-feature set varies between 91% and 98%. In this step the Threshold selector model achieved the
best results of 96%, 98% and 98% in terms of Specificity, Balanced Accuracy and Precision, while
the scores of the other classification models were similar, with an average Sensitivity of 88%,
an Accuracy of 87%, an 𝑓-measure of 90% and an MCC of 68%. Observing the data-oriented
techniques, over-sampling the minority class is clearly the best algorithm on most measures: it
achieved 93%, 96%, 94%, 98%, 96%, 94% and 88% in terms of Sensitivity, Specificity, Accuracy,
Balanced Accuracy, Precision, 𝑓-measure and MCC, with the Oversampling minority technique
scoring the highest Specificity and Balanced Accuracy among the resampling methods. Again, we
noticed that the overall performance was not hurt by reducing the number of features.
Table 4. Performance comparison of the different learning paradigms after applying Cuckoo algorithm

Model | Sensitivity | Specificity | Accuracy | Balanced Accuracy | Precision | 𝑓-measure | MCC
Bagging | 0.886 | 0.829 | 0.868 | 0.914 | 0.918 | 0.902 | 0.689
Cost-sensitive | 0.887 | 0.830 | 0.869 | 0.915 | 0.919 | 0.903 | 0.691
MetaCost | 0.882 | 0.843 | 0.870 | 0.921 | 0.925 | 0.903 | 0.691
Threshold selector | 0.790 | 0.964 | 0.844 | 0.982 | 0.979 | 0.874 | 0.631
SMOTE | 0.874 | 0.885 | 0.879 | 0.942 | 0.883 | 0.878 | 0.755
Oversampling minority | 0.931 | 0.961 | 0.946 | 0.981 | 0.960 | 0.945 | 0.879
Class Balancer | 0.857 | 0.874 | 0.866 | 0.937 | 0.872 | 0.865 | 0.726
Under-sampling | 0.922 | 0.872 | 0.906 | 0.936 | 0.940 | 0.931 | 0.777
Next, Table 5 shows the classification results for the 14 features selected by the Bat search
technique. We observed that the classification Balanced Accuracy using the Bat algorithm varies
between 90% and 93%. The Threshold selector, in this step, obtained 86%, 93% and 93% in terms of
Specificity, Balanced Accuracy and Precision, while the scores of the other classification models
were similar, with an average Sensitivity of 87%, an Accuracy of 85%, an 𝑓-measure of 89% and an
MCC of 65%. Observing the data-oriented techniques, over-sampling the minority class is clearly
the best algorithm on most measures, achieving 92%, 91%, 94%, 93% and 78% in terms of
Sensitivity, Accuracy, Precision, 𝑓-measure and MCC, while the SMOTE algorithm scored best in
Specificity and Balanced Accuracy. The results obtained suggest that over-sampling the minority
class can enhance the predictive accuracy of the Random Forest.
Table 5. Performance comparison of the different learning paradigms after applying Bat algorithm

Model | Sensitivity | Specificity | Accuracy | Balanced Accuracy | Precision | 𝑓-measure | MCC
Bagging | 0.879 | 0.819 | 0.860 | 0.909 | 0.914 | 0.896 | 0.671
Cost-sensitive | 0.878 | 0.818 | 0.859 | 0.909 | 0.913 | 0.895 | 0.668
MetaCost | 0.874 | 0.807 | 0.853 | 0.903 | 0.908 | 0.891 | 0.654
Threshold selector | 0.848 | 0.858 | 0.851 | 0.929 | 0.929 | 0.887 | 0.647
SMOTE | 0.855 | 0.934 | 0.895 | 0.967 | 0.928 | 0.890 | 0.760
Oversampling minority | 0.917 | 0.883 | 0.906 | 0.942 | 0.945 | 0.931 | 0.776
Class Balancer | 0.837 | 0.900 | 0.869 | 0.950 | 0.893 | 0.864 | 0.715
Under-sampling | 0.903 | 0.847 | 0.886 | 0.924 | 0.928 | 0.916 | 0.730
Table 6 shows the classification results for the 7 features selected by the Tabu search
technique. We observed that the classification Balanced Accuracy using the Tabu technique varies
between 91% and 96%. In this step the Threshold selector obtained 92%, 96% and 96% in terms of
Specificity, Balanced Accuracy and Precision, while the scores of the other classification models
were similar, with averages of 85% for Accuracy, 91% for Precision, 89% for 𝑓-measure and 66%
for MCC. As for the data-oriented techniques, Oversampling the minority class is clearly the best
algorithm on most measures: it achieved 94%, 92%, 94%, 94% and 80% in terms of Sensitivity,
Accuracy, Precision, 𝑓-measure and MCC. It is noteworthy that the SMOTE algorithm scored the
best Specificity and Balanced Accuracy, at 91% and 96% respectively.
Table 6. Performance comparison of the different learning paradigms after applying Tabu

Model | Sensitivity | Specificity | Accuracy | Balanced Accuracy | Precision | 𝑓-measure | MCC
Bagging | 0.872 | 0.826 | 0.858 | 0.913 | 0.916 | 0.894 | 0.665
Cost-sensitive | 0.870 | 0.844 | 0.861 | 0.922 | 0.924 | 0.896 | 0.671
MetaCost | 0.869 | 0.835 | 0.858 | 0.918 | 0.920 | 0.894 | 0.664
Threshold selector | 0.817 | 0.922 | 0.850 | 0.961 | 0.958 | 0.882 | 0.643
SMOTE | 0.865 | 0.913 | 0.890 | 0.957 | 0.908 | 0.886 | 0.761
Oversampling minority | 0.937 | 0.871 | 0.916 | 0.936 | 0.941 | 0.939 | 0.804
Class Balancer | 0.846 | 0.861 | 0.853 | 0.930 | 0.858 | 0.852 | 0.701
Under-sampling | 0.916 | 0.829 | 0.889 | 0.915 | 0.921 | 0.919 | 0.740
Now, to test the prediction performance based on the agreement between the feature selection
techniques, a further group of experiments was conducted. Figure 1 visualizes as a Venn diagram
the top five relevant features shared between PSO, Bat, Cuckoo and Tabu, which are MINUTE,
HARM_EV, REL_ROAD, SCH_BUS and DRUNK_DR. Table 7 shows the classification results using these
five features. We observed that the classification Balanced Accuracy with this reduced set varies
between 87% and 92%. In this step the Threshold selector obtained 85%, 92% and 92% in terms of
Specificity, Balanced Accuracy and Precision, while the scores of the other classification models
were similar, with averages of 81% for Accuracy, 83% for Sensitivity, 85% for 𝑓-measure and 55%
for MCC.
As for the data-oriented techniques, under-sampling the majority class is clearly the best
algorithm on most measures: it achieved 87%, 90% and 89% in terms of Sensitivity, Precision and
𝑓-measure. It is noteworthy that the SMOTE algorithm scored the best Specificity, Accuracy,
Balanced Accuracy and MCC, at 91%, 86%, 95% and 70% respectively. In general, the Balanced
Accuracy of the different classifiers remained close to or above 90% even with only five
features.
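The agreement among the four selectors can be cross-checked with a simple vote count over the feature subsets as listed in Table 2:

```python
from collections import Counter

# Feature subsets from Table 2 (as transcribed in this paper).
subsets = {
    "PSO": {"MINUTE", "VE_TOTAL", "PEDS", "HARM_EV", "MAN_COLL", "REL_ROAD",
            "SP_LIMIT", "ALIGNMNT", "C_M_ZONE", "NOT_MIN", "HOSP_HR",
            "SCH_BUS", "CF1", "DRUNK_DR", "VE_FORMS"},
    "Cuckoo": {"STATE", "MINUTE", "VE_TOTAL", "PEDS", "HARM_EV", "MAN_COLL",
               "REL_ROAD", "ALIGNMNT", "C_M_ZONE", "HOSP_HR", "SCH_BUS",
               "DRUNK_DR", "VE_FORMS"},
    "Bat": {"VE_TOTAL", "PEDS", "HARM_EV", "MAN_COLL", "REL_ROAD", "ALIGNMNT",
            "TRA_CONT", "HOSP_HR", "HOSP_MN", "SCH_BUS", "CF1", "DRUNK_DR",
            "VE_FORMS", "WEATHER"},
    "Tabu": {"MINUTE", "HARM_EV", "REL_ROAD", "HOSP_MN", "SCH_BUS",
             "DRUNK_DR", "VE_FORMS"},
}

# How many of the four selectors include each feature, most agreed-on first.
votes = Counter(f for s in subsets.values() for f in s)
ranked = sorted(votes, key=lambda f: (-votes[f], f))
```

Features such as HARM_EV, REL_ROAD, SCH_BUS and DRUNK_DR are chosen by all four selectors, which matches their appearance in the shared region of the Venn diagram in Figure 1.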
Table 7. Performance comparison of the different learning paradigms after choosing the most relevant features

Model | Sensitivity | Specificity | Accuracy | Balanced Accuracy | Precision | 𝑓-measure | MCC
Bagging | 0.837 | 0.756 | 0.811 | 0.878 | 0.882 | 0.859 | 0.562
Cost-sensitive | 0.827 | 0.778 | 0.812 | 0.889 | 0.891 | 0.857 | 0.563
MetaCost | 0.839 | 0.739 | 0.807 | 0.869 | 0.875 | 0.856 | 0.553
Threshold selector | 0.795 | 0.849 | 0.812 | 0.924 | 0.920 | 0.853 | 0.566
SMOTE | 0.818 | 0.910 | 0.864 | 0.955 | 0.900 | 0.857 | 0.697
Oversampling minority | 0.821 | 0.757 | 0.802 | 0.879 | 0.883 | 0.851 | 0.539
Class Balancer | 0.800 | 0.864 | 0.832 | 0.932 | 0.855 | 0.827 | 0.644
Under-sampling | 0.870 | 0.794 | 0.846 | 0.897 | 0.902 | 0.886 | 0.639
Figure 1. Relationship between PSO, Cuckoo, Bat and Tabu search algorithms
The last set of experiments confirmed that the suggested technique can efficiently compete with
the best prediction meta-classifiers while using the fewest features. The suggested
classification model is considered adequate for selection, as the 10-fold cross-validation
provided a sensitivity of at least 0.5 and a specificity of at least 0.5 in all experiments. The
results also indicate that meta-classifiers improve performance, particularly when handling
imbalanced datasets, and that employing the Random Forest classifier as a base classifier clearly
improves their prediction accuracy. Among the meta-classifier-based methods, the Threshold
selector provided the best performance in many cases. Additionally, the over-sampling results are
more satisfactory than those of the other resampling techniques. In this regard, the proposed
technique attained a mean overall Accuracy of 91% and a Balanced Accuracy varying between 96%
and 99% using 7-15 features instead of the 50 original features (Figure 2). Based on the result
analysis, the following crash-related factors are suggested to affect the fatality rate: the
minute at which the crash occurred; the event that resulted in the most severe injury (the "First
Harmful Event"); the location of the crash relative to its position within or outside the
trafficway; whether a school bus, or a motor vehicle functioning as a school bus, was involved;
and the number of drinking drivers involved in the crash. We agree with previous studies [17,18]
which indicated that human factors and the collision type affect the fatality rate more strongly
than environmental factors. The results obtained might help in considering and evaluating the
factors related to fatal accidents.
Figure 2. Performance comparison of the different learning paradigms
5. CONCLUSION
In this paper, we presented a computational method that can accurately determine the key features
that influence classification of the FARS dataset. To that end, we applied and compared the
prediction performance of several meta-classifiers. Moreover, several data-oriented approaches
were applied to handle the uneven class-ratio problem. The results of the study showed that the
Threshold Selection classifier and the over-sampling technique had the best predictive ability
among the evaluated techniques when using Random Forest as the base classifier. Experiments on
the FARS dataset empirically showed that our proposed method can reduce the number of features by
almost 90% and still obtain satisfactory results.
REFERENCES
[1] World Health Organization- WHO, Global status report on road safety, Available from World Wide
Web: https://www.who.int/violence_injury_prevention/road_safety_status/2018/en/ (accessed
December 2019).
[2] Qiu, C., Wang, C., Fang, B.& Zuo, X., (2014). A multiobjective particle swarm optimization-based
partial classification for accident severity analysis. Appl. Artif.Intell.28,555–576.
[3] Kwon, O.H.; Rhee, W.; & Yoon, Y. (2015). Application of classification algorithms for analysis of
road safety risk factor dependencies. Accident Analysis and Prevention, 75, 1-15.
[4] Kolyshkina, I., & Brookes, R. (2002). Data mining approaches to modeling insurance risk. Retrieved
December 14, 2019, from http:// www.salford-systems.com/doc/insurance.pdf
[5] Li, Y., Guo, H., Xiao, L., Yanan, L., & Jinling, L, (2016) Adapted ensemble classification algorithm
based on multiple classifier system and feature selection for classifying multi-class imbalanced data,
Knowledge-Based Systems, 94, 88–104.
[6] Trac Integrated SCM & Project Management. Fatal Accidents Dataset,
https://wiki.csc.calpoly.edu/datasets/wiki/HighwayAccidents.
[7] U.S. Department of Transportation, FARS analytic reference guide, 1975 to 2009.
https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/811352, 2010.
[8] Eric M Ossiander & Peter Cummings, (2002), Freeway speed limits and traffic fatalities in
Washington State. Accident Analysis & Prevention, 34(1):13–18.
[9] KMA Solaiman, Md Mustafizur Rahman & Nashid Shahriar, (2013), Avra Bangladesh collection,
analysis & visualization of road accident data in Bangladesh, In Proceedings of International
Conference on Informatics, Electronics & Vision, 1–6.
[10] Sachin Kumar & Durga Toshniwal, (2015), Analysing road accident data using association rule
mining. In Proceedings of International Conference on Computing, Communication and Security, 1–
6.
[11] Chang L. & H. Wang, (2006), "Analysis of traffic injury severity: An application of non-parametric
classification tree techniques Accident analysis and prevention", Accident analysis and prevention,
Vol. 38(5), pp 1019- 1027.
[12] S. Krishnaveni & M. Hemalatha, (2011), A perspective analysis of traffic accident using data mining
techniques. International Journal of Computer Applications, 23(7):40–48.
[13] Kwon, O.H.; Rhee, W. & Yoon, Y. (2015). Application of classification algorithms for analysis of
road safety risk factor dependencies. Accident Analysis and Prevention, 75, 1-15.
[14] T. K. Bahiru, D. Kumar Singh & E. A. Tessfaw, (2018), "Comparative Study on Data Mining
Classification Algorithms for Predicting Road Traffic Accident Severity," Second International
Conference on Inventive Communication and Computational Technologies (ICICCT), Coimbatore,
2018, 1655-1660.
[15] Silva, HCE & Saraee, MH, (2019), Predicting road traffic accident severity using decision trees and
time-series calendar heatmaps , in: The 6th IEEE Conference on Sustainable Utilization and
Development in Engineering and Technology (2019 IEEE CSUDET), 7 - 9 November 2019, Penang,
Malaysia.
[16] Mujalli, R.O.; López, G. & Garach, L. (2016). Bayes classifiers for imbalanced traffic accidents
datasets. Accident Analysis and Prevention, 88, 37-51.
[17] Liling Li, Sharad Shrestha & Gongzhu Hu, (2017), “Analysis of Road Traffic Fatal Accidents Using
Data Mining Techniques”, Software Engineering Research, Management and Applications (SERA),
2017 IEEE 15th International Conference, DOI: 10.1109/SERA.2017.7965753
[18] Pakgohar A., Tabrizi R. S., Khalili M. & Esmaeili A., (2011), The role of human factor in incidence
and severity of road crashes based on the cart and LR regression: a data mining approach. Procedia
Computer Science, World Conference on Information Technology, 3, pp. 764– 769.
[19] Kotsiantis S, Kanellopoulos D & Pintelas P, (2006) Handling imbalanced datasets: a review.
GESTS Int Trans Comput Sci Eng 30(1):25–36
[20] López V, Fernández A, Moreno-Torres JG & Herrera F, (2012), Analysis of preprocessing vs.
cost-sensitive learning for imbalanced classification: open problems on intrinsic data
characteristics. Expert Syst Appl 39:6585–6608. https://doi.org/10.1016/j.eswa.2011.12.043
[21] Schierz AC, (2009), Virtual screening of bioassay data. J Cheminform 1:21.
https://doi.org/10.1186/1758-2946-1-21
[22] Lin W-J & Chen JJ, (2013), Class-imbalanced classifiers for high-dimensional data. Brief
Bioinform 14:13–26. https://doi.org/10.1093/bib/bbs006.
[23] N. D. Singh, (2018), "Clustering and learning from imbalanced data", arXiv preprint
arXiv:1811.00972.
[24] Dietterich, T.G., (2000), An Experimental Comparison of Three Methods for Constructing
Ensembles of Decision Trees: Bagging, Boosting, and Randomization, Machine Learning, 40, 139-
157. http://dx.doi.org/10.1023/A:1007607513941
[25] Breiman L, (2001), Random Forests, Mach Learn 45:5–32
[26] L. Breiman, (1996), Bagging predictors, Machine Learning, 24:123–140; L. Breiman, (2000),
Some Infinity Theory for Predictor Ensembles, Technical Report 577, UC Berkeley. URL
http://www.stat.berkeley.edu/~breiman.
[27] Oshiro TM, Perez PS & Baranauskas JA, (2012), How many trees in a Random Forest?, Machine
learning and data mining in pattern recognition, Springer, Berlin, pp 154–168.
[28] Sun Y, Kamel MS, Wong AKC & Wang Y., (2007), Cost-sensitive boosting for classification of
imbalanced data. Pattern Recognition, 40: 3358–3378.
[29] Guo Haixiang, Li Yijing, Jennifer Shang, Gu Mingyun, Huang Yuanyue & Gong Bing, (2016),
Learning from class-imbalanced data: Review of methods and applications, Expert Systems with
Applications.
[30] Kotsiantis S, Kanellopoulos D & Pintelas P, (2006), Handling imbalanced datasets: a review, GESTS
Int Trans Comput Sci Eng 30(1):25–36.
[31] Chawla NV, Japkowicz N & Kotcz A, (2004), Editorial: special issue on learning from imbalanced
data sets. SIGKDD Explor Newsl 6:1–6. https://doi.org/10.1145/1007730.1007733.
[32] Powers D, (2011), Evaluation: from precision, recall and F-measure to ROC, informedness,
markedness & correlation. J Mach Learn Technol 2:37–63.
[33] Domingos, P., (1999), MetaCost: A general method for making classifiers cost-sensitive, In
Proceedings of the 15th international conference on knowledge discovery and data mining, 155–164.
[34] Quinlan, J.R. (1996), Bagging, boosting, and C4.5, in Proceedings of the 13th National Conference
on Artificial Intelligence, pp. 725-730.
[35] Schapire, R.E., Y. Freund, P. Bartlett, & W.S. Lee (1997), Boosting the margin: A new explanation
for the effectiveness of voting methods, in Proceedings of the Fourteenth International Conference on
Machine Learning, 322-330.
[36] Bauer, E. & Kohavi, R. (1999), An Empirical Comparison of Voting Classification Algorithms:
Bagging, Boosting, and Variants. Machine Learning, Kluwer Academic Publishers, 36, 105-139.
[37] F. Glover, (1989), “Tabu search—part I,” ORSA Journal on Computing, vol. 1, no. 3, 190–206.
[38] F. Glover, (1990), “Tabu search—part II,” ORSA Journal on Computing, vol. 2, no. 1, 4–32,.
[39] Tahir, M.A., Bouridane, A., Kurugollu, F. & Amira, A., (2004), Feature Selection using Tabu Search
for Improving the Classification Rate of Prostate Needle Biopsies, In: Proc. 17th International
Conference on Pattern Recognition (ICPR 2004), Cambridge, UK.
[40] Tahir, M.A., Bouridane, A. & Kurugollu, F., (2004), Simultaneous Feature Selection and
Weighting for Nearest Neighbor Using Tabu Search. In: Lecture Notes in Computer Science (LNCS
3177), 5th International Conference on Intelligent Data Engineering and Automated Learning
(IDEAL 2004), Exeter, UK.
[41] Korycinski, D., Crawford, M., Barnes, J.W & Ghosh, J., (2003), Adaptive feature selection for
hyperspectral data analysis using a binary hierarchical classifier and Tabu Search, In: Proceedings of
the IEEE International Geoscience and Remote Sensing Symposium, IGARSS.
[42] J. Kennedy & R.C. Eberhart, (2001), “Swarm intelligence”, Morgan Kaufmann.
[43] J. Kennedy & R. Eberhart, (1997), “A discrete binary version of the particle swarm algorithm”,
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, Vol.5, 4104–
4108.
[44] Xin-She Yang & Suash Deb, (2009), Cuckoo search via Lévy flights. In Nature & Biologically
Inspired Computing, 2009. NaBIC 2009. World Congress on, 210–214.
[45] Xin-She Yang & Suash Deb, (2014), “Cuckoo search: recent advances and applications,” Neural
Computing and Applications, vol. 24, no. 1, 169–174.
[46] X.-S. Yang, (2010), A new metaheuristic bat-inspired algorithm. In Nature Inspired Cooperative
Strategies for Optimization (NICSO 2010), volume 284 of Studies in Computational Intelligence,
Springer Berlin Heidelberg, 65–74.
[47] Powers, D.M.W., (2011), Evaluation: From Precision, Recall and F-Measure to ROC,
Informedness, Markedness & Correlation. J. Mach. Learn. Technol. 2, 37–63.
[48] Hall M, Frank E, Holmes G et al (2009) The WEKA data mining software: an update. SIGKDD
Explor Newsl 11:10–18. https:// doi.org/10.1145/1656274.1656278
[49] Oshiro TM, Perez PS, Baranauskas JA (2012) How many trees in a Random Forest? In: Machine
learning and data mining in pattern recognition. Springer, Berlin, pp 154–168.