Using SQL Server and SSAS, an analysis of road accidents in the United Kingdom was performed to study how road conditions, road surface, potholes, weather, and lighting conditions affect driving and lead to accidents. In this study, R was used to clean the database, and SQL Server and SSAS were used to build an ETL cube for flexible report generation.
PREDICTION OF ROAD ACCIDENT MODELLING FOR INDIAN NATIONAL HIGHWAYS (IAEME Publication)
The objective of this research article is to identify the most critical safety-influencing variables on a section of four-lane National Highway-18 (old)/40 (new) through statistical models that explain the relationship between accident frequency and highway safety variables. The highway traverses mainly plain terrain through mostly agricultural areas. The study covers the newly constructed four-lane road between chainage 224.000 (Chagalamarri) and 359.9 (Kurnool), identifying the safety deficiencies responsible for road accidents. Multiple linear regression models were built in two categories: one for the 2-lane sections and one for the 4-lane sections. Validation tools were applied to examine the models' ability to predict accidents.
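The multiple-linear-regression approach described above can be sketched roughly as follows; every predictor name and number here is invented for illustration and is not taken from the study.

```python
# Hypothetical sketch: ordinary least squares regression of accident
# counts on highway safety variables. All names and values are invented.
import numpy as np

def fit_accident_model(X, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk by least squares; return coefficients."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

# Toy sections: [traffic volume, curve density, access points] per section
X = np.array([[10.0, 1.0, 2.0],
              [20.0, 3.0, 5.0],
              [30.0, 2.0, 3.0],
              [40.0, 4.0, 8.0]])
y = np.array([5.0, 11.0, 13.0, 19.0])  # accident counts

coef = fit_accident_model(X, y)
pred = np.column_stack([np.ones(len(X)), X]) @ coef
```

In the real study, separate models of this shape would be fitted for the 2-lane and 4-lane sections and then checked with validation tools.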
Cisco Smart Intersections: IoT insights using video analytics and AI (Carl Jackson)
In this trial, IoT, video analytics, deep learning (DL), and artificial intelligence (AI) were evaluated for traffic flow assessment and insights into road user behaviour at an intersection at the AIMES testbed in Melbourne, in partnership with the University of Melbourne, the Department of Transport (DOT), IAG, and Cisco.
Towards Smart Cities Development: A Study of Public Transport System and Traf... (sarfraznawaz)
The increasing number of privately owned vehicles reflects Malaysians' preferred mode of mobility and a lack of interest in the public transport system. In most developing countries, including Malaysia, motorized vehicles are the major contributors to air pollution in urban zones. Air pollution is a silent killer, infiltrating vital organs and leading to serious disease and death. This research critically analyses emissions of air pollutants such as CO, NO2, SO2, hydrocarbons, and PM from various sources in Malaysia, with emphasis on emissions from motor vehicles. It also discusses the public transport initiatives undertaken by the government of Malaysia, such as enhancing the bus and rail systems, transforming Malaysia's taxi system, managing travel demand, and improving the integration of the urban public transport system. Furthermore, in the context of smart-city initiatives, this research identifies weather, safety, security, and inadequate infrastructure as major barriers to Malaysia's adoption of smart, eco-friendly mobility practices such as cycling, carpooling, and car sharing.
A multi-objective evolutionary scheme for control points deployment in intell... (IJECEIAES)
One of the problems that hinders emergency response in developing countries is monitoring activities on inter-urban road networks. In the literature, control points are proposed in this context to ensure efficient monitoring: providing good coverage while minimizing installation costs and the number of accidents across these networks. In this work, we propose an optimal deployment of these control points using several evolutionary multi-objective optimization algorithms: the non-dominated sorting genetic algorithm II (NSGA-II), multi-objective particle swarm optimization (MOPSO), the strength Pareto evolutionary algorithm II (SPEA-II), and the Pareto envelope-based selection algorithm II (PESA-II). We tested and compared the resulting deployments using Pareto fronts and performance indicators such as spread, hypervolume, and inverted generational distance (IGD). The results show that NSGA-II is the most suitable method for deploying these control points.
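The shared core of the algorithms this abstract compares is non-dominated (Pareto) sorting of candidate solutions. A minimal sketch, with invented objective values, might look like this:

```python
# Hypothetical sketch: extracting the Pareto (non-dominated) front of
# candidate control-point deployments, the core step shared by NSGA-II,
# SPEA-II, and PESA-II. Both objectives are minimized; the candidate
# values below are invented for illustration.

def dominates(a, b):
    """a dominates b if a is no worse in all objectives and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the subset of points not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# (installation cost, uncovered accidents) for five candidate deployments
candidates = [(4, 9), (5, 4), (7, 2), (6, 6), (9, 1)]
front = pareto_front(candidates)
```

Here (6, 6) is dominated by (5, 4), so it drops out of the front; the evolutionary algorithms then evolve populations toward (and spread along) this front.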
International Refereed Journal of Engineering and Science (IRJES) (irjes)
The International Refereed Journal of Engineering and Science (IRJES) is a leading international journal for the publication of new ideas, state-of-the-art research results, and fundamental advances in all aspects of engineering and science. IRJES is an open-access, peer-reviewed international journal whose primary objective is to provide the academic community and industry a platform for the submission of original research and applications.
U.S. Road Accidents Data Analysis and Visualization (Mrinalini Sundar)
Using the US accident rate as a case study, this presentation shows how automated, code-free integration of data housed in major platforms into Azure, Snowflake, Amazon Redshift, or Google BigQuery can be done with Datom.ai. All data transfers use a drag-and-drop interface and a transparent pricing mechanism based on actual usage.
CHARACTERIZING HAZARDOUS ROAD LOCATIONS AND BLACK SPOTS ON ROUTE N8 (DHAKA-BA... (Fayaz Uddin)
Road traffic accidents and the corresponding casualties are among the most concerning issues in the transportation sector of a developing country like Bangladesh, where road crashes are remarkably high. According to the police-reported road traffic accident database, about 2,800 or more accidents occur in Bangladesh every year. This research analyzes accident data from 2007 to 2012 using the Microcomputer Accident Analysis Package (MAAP5) software for route N8 (Dhaka – Mawa – Barisal – Patuakhali National Highway) in Bangladesh. It identifies accident-prone locations, commonly termed black spots and Hazardous Road Locations (HRLs), on route N8, and maps them using a Geographic Information System (GIS). Head-on, rear-end, overturning, side-swipe, and hit-pedestrian collisions are the most dominant accident types. The analysis shows that the maximum number of accidents on route N8 occurred in fair weather, and the results clearly indicate that buses contribute the most to accidents.
Towards Improving Crash Data Management System in Gulf Countries (IJERA Editor)
Scientific and analytical approaches to accident data collection, storage, and analysis are essential in dealing with road safety problems. Police accident records form the main (and sometimes the only) source of accident data in the majority of countries. Access to the accident database is also important for identifying specific safety problems and evaluating the effectiveness of the countermeasures introduced. Accident data collection and analysis aided by technological innovations such as Electronic Data Entry (EDE), Electronic Data Transfer (EDT), and Geographic Information Systems (GIS) are implemented in developed countries. Developing countries, including the Gulf countries, should take advantage of the developed countries' experience with advanced accident data management systems to identify, more accurately, the main factors contributing to traffic accidents. The main purpose of this research is to provide information on the accident statistics process in the state of Virginia, from the time an accident occurs until it is stored in the database, with the aim of using it to improve the process of collecting and maintaining accident data systems in Gulf countries. The task is performed by reviewing the relevant international literature and interviewing police officers in charge and academic researchers, in order to compare accident data management systems and the quality of the data. Recommendations for developing the crash data management system are derived from the research results and international experience.
This work discusses the study and development of a graphical interface and the implementation of a machine learning model for vehicle traffic injury and fatality prediction over a specified date range and for a given zip (US postal) code, based on New York City's (NYC) vehicle crash data set. While previous studies focused on accident causes, little insight has been offered into how such data may be used to forecast future incidents. Where past studies concentrated on certain road segment types, such as highways and other streets, and on specific geographic regions, this study offers a citywide review of collisions. Using modern database and networking technology, a user-friendly interface was created to display vehicle crash series. A support vector machine model was then built to evaluate the likelihood of an accident and the consequent injuries and deaths at the zip-code level for all of NYC, to better mitigate such events. The findings show that the visualization and prediction approach is efficient and accurate. Beyond transportation experts and government policymakers, the machine learning approach delivers useful insights to the insurance business, since it quantifies collision risk at specific places.
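The study's support vector model can be approximated by a minimal linear SVM trained with sub-gradient descent on the hinge loss. The features and labels below are invented; the real NYC model and its inputs are far richer.

```python
# Hypothetical sketch: a minimal linear SVM (regularized hinge loss,
# sub-gradient descent), standing in for the study's support vector model.
# Features and labels are invented for illustration.
import numpy as np

def train_linear_svm(X, y, lam=1e-4, lr=0.05, epochs=500):
    """Sub-gradient descent on the regularized hinge loss; labels must be +/-1."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:          # inside the margin: push outward
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                              # outside the margin: only decay
                w -= lr * lam * w
    return w, b

# Toy zip-code/day buckets: [crash count, congestion index]
# label +1 = at least one injury crash, -1 = none
X = np.array([[1.0, 2.0], [2.0, 1.0], [1.0, 1.0],
              [6.0, 7.0], [7.0, 6.0], [7.0, 7.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
```

A production model would use a kernelized or library implementation and many more features, but the margin-maximizing objective is the same.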
Analysis of Machine Learning Algorithm with Road Accidents Data Sets (Dr. Amarjeet Singh)
At present, road transport systems fail to keep up with the exponential growth in vehicle populations, and computing the fastest driving routes while anticipating crashes under varying traffic conditions is a critical problem. The approach taken here is to explore a vehicle dataset with ensemble learning methods, seeking the safest route choice by comparing the prediction accuracy of supervised machine learning algorithms. In statistics and machine learning, ensemble methods combine multiple learning algorithms to improve predictive performance. The dataset is assessed with a supervised machine learning technique (SMLT) to extract data characteristics: variable identification, univariate, bivariate, and multivariate analysis, missing-value treatment, and data validation, followed by data cleaning/preparation and visualization of the whole dataset. Finally, the performance of different machine learning algorithms on the vehicle dataset is compared, together with an evaluation of GUI-based road accident prediction from the given attributes.
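Two of the preprocessing steps the abstract enumerates, missing-value treatment and univariate summaries, might look like this on an invented accident table; the column names are illustrative only, not the real dataset's.

```python
# Hypothetical sketch of missing-value treatment and univariate analysis
# on an invented road-accident table. Columns are illustrative only.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "speed_limit": [30.0, 40.0, np.nan, 60.0, 50.0],
    "severity":    [1.0, 2.0, 2.0, 3.0, np.nan],
})

# Missing-value treatment: fill numeric gaps with each column's median
clean = df.fillna(df.median(numeric_only=True))

# Univariate analysis: per-column summary statistics
summary = clean.describe()
```

Bivariate and multivariate analysis would follow the same pattern (e.g., `clean.corr()` or grouped aggregations) before models are compared.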
COUNTRIES CAPABILITIES TO ACHIEVE AMBITIOUS... (BambangWahono3)
The purpose of this research paper is to observe and analyze how the economic growth of EU countries is accompanied by growth in the motorization rate, which leads to road accidents and large numbers of killed and injured people. The research methodology is a statistical analysis of economic growth, motorization rate, and road accidents in the EU countries during 2010–2020, applying quantitative analysis and a comparison method. Findings: the paper shows how the increase in motor vehicles in EU countries drives road accidents and mortality. Differences between the growth of the motorization rate and the decline in fatalities depend on the level of economic development: at low income levels, the rate of increase in motor vehicles outpaces the decline in fatalities per motor vehicle, while at higher income levels the reverse occurs. Practical implications: the paper demonstrates that road traffic safety authorities need to know safety performance indicators and take them into account when preparing legislation to strengthen the EU's "zero victims" vision and give better protection to victims of motor vehicle accidents. Originality: the paper analyses the relationship between motorization levels and fatalities across EU countries with different rates of economic growth over recent decades.
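The kind of quantitative comparison the paper describes, relating motorization rate to road fatalities, can be sketched as a simple correlation over country-level observations; the figures below are invented, not the paper's data.

```python
# Hypothetical sketch: correlating motorization rate with road fatalities
# across invented country-level observations.
import numpy as np

motorization = np.array([300, 400, 500, 600, 700])  # vehicles per 1000 people
fatalities   = np.array([ 90,  80,  65,  55,  40])  # deaths per million people

# Pearson correlation coefficient between the two series
r = np.corrcoef(motorization, fatalities)[0, 1]
```

In this toy series higher motorization coincides with lower fatality rates, the pattern the paper reports for higher-income countries; the full study disaggregates this by income level and year.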
Minimum 350-500 Words each answer Academic Sources Discussio... (AlleneMcclendon878)
Minimum 350-500 Words each answer
Academic Sources
Discussion Question 1:
Government economic studies reveal that young adults, not middle-aged or older adults, are having the most difficult time in today’s economy. Although the nation’s labor market shows a decline in the unemployment rate, the percentage of young adults, ages 18 to 24, currently employed (54 percent) is at the lowest level since the government data collection began in 1948. If you were working for a national survey organization doing a general public survey of young adults and older adults, what topics and questions would you design into your survey to elaborate on this finding?
Discussion Question 2:
One design problem in the development of measurement instruments concerns the sequence of questions. What suggestions would you give to a novice researcher designing his or her first questionnaire?
Business Research Methods, 13e/Schindler
>cases
State Farm, the nation’s largest auto insurer, distributed a list of the 10 most dangerous intersections in the United States based on crashes resulting in claims by its policyholders. What started as a study to reduce risk turned into an ongoing study that directs a major public relations effort: State Farm provides funds for communities to further research their dangerous intersections and initiate improvements based on the research. This case tells you how the State Farm Dangerous Intersections initiative got started and how it is done. www.statefarm.com
>Abstract
>The Scenario
State Farm Insurance has a rich history of proactive safety involvement in auto and appliance design to reduce injury and property loss. In June 2001, State Farm Insurance, Inc., released the second report in its Dangerous Intersection reporting series. State Farm modeled its program after an initiative by the Insurance Corporation of British Columbia, Canada (ICBC), and the American Automobile Association of Michigan (AAA) to help position the nation’s largest auto insurer as the most safety-conscious insurer. ICBC had patterned its program on an earlier effort in Victoria, Australia. AAA, in turn, benchmarked its program on the ICBC program. AAA invited State Farm to help fund one of its intersection studies. State Farm saw this as an opportunity to expand its effort into a nationwide campaign in 1999. “The 2001 study is part of a larger effort focused on loss prevention and improving the safety of intersections around the U.S.A.,” shared State Farm research engineer John Nepomuceno. State Farm has allocated significant resources as well as funds to the initiative. Since its inception, every city with an intersection on the overall list of dangerous intersections is eligible to apply for a $20,000 grant to defray the cost of a comprehensive traffic engineering study of the intersection. Additionally, each city named to the national top 10 dangerous intersection list is eligible for a grant of $100,000 per intersection to defray s ...
Traffic accidents kill approximately 1.2 million people worldwide every year. According to the World Health Organization (2004), road traffic injuries will rank third in the global burden of disease by 2030. To tackle traffic accidents effectively, one needs to analyse their patterns. The traffic accident black spot programme developed from the analysis of traffic accidents (Chris's Britain Road Directory, 2017). A black spot, or black site, is an area with high traffic accident risk. In 1955, the UK introduced an unprecedented type of traffic sign, the Accident Black Spot sign (The National Archives, 2017). Since then, more and more Commonwealth countries have followed the UK in promoting and developing their own black spot investigations. In this paper, I will first explain why traffic accidents occur and describe common methods for determining black spots. After that, I will present the current situation in Hong Kong.
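One simple black-spot determination method of the kind such papers survey is grid-cell counting: bucket accident coordinates into a grid and flag cells whose count exceeds a threshold. The coordinates, cell size, and threshold below are invented for illustration.

```python
# Hypothetical sketch: black-spot detection by grid-cell counting.
# All coordinates and parameters are invented for illustration.
from collections import Counter

def black_spots(points, cell=0.01, threshold=3):
    """Return grid cells containing at least `threshold` accidents."""
    counts = Counter((round(x / cell), round(y / cell)) for x, y in points)
    return {cell_id for cell_id, n in counts.items() if n >= threshold}

# Invented accident coordinates: one location with 4 crashes, two with fewer
accidents = [(22.281, 114.158)] * 4 + [(22.302, 114.177)] * 2 + [(22.250, 114.100)]
spots = black_spots(accidents)
```

Real programmes refine this with road-length normalization, severity weighting, and statistical tests, but the counting step is the common core.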
Cisco Smart Intersections: IoT insights using Wi-Fi (Carl Jackson)
In this trial, an edge-hosted Wi-Fi solution was evaluated for extracting insights into road user behaviour and performance at the intersection within the AIMES testbed in Melbourne, in partnership with the University of Melbourne, the Department of Transport (DOT), Cohda Wireless, IAG, and Cisco.
Opendatabay - Open Data Marketplace.pptx (Opendatabay)
Opendatabay.com unlocks the power of data for everyone. The Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
It is the first open hub for data enthusiasts to collaborate and innovate: a platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, Opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. It leverages cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex: Opendatabay simplifies data acquisition with an intuitive interface and robust search tools, letting you effortlessly explore, discover, and access the data you need and focus on extracting valuable insights. Opendatabay also breaks new ground with dedicated, AI-generated synthetic datasets.
These privacy-preserving datasets can be used for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits, Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay: the marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Adjusting primitives for graph: SHORT REPORT / NOTES (Subhajit Sahu)
These notes concern primitives used by graph algorithms such as PageRank over Compressed Sparse Row (CSR), an adjacency-list-based graph representation, and benchmark the following vector operations:
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
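A Python analogue of the first "Sum with different modes" experiment can illustrate the sequential-vs-parallel comparison; this assumes only that the comparison is a plain loop versus a vectorized reduction over the same data, not the notes' actual OpenMP/CUDA code.

```python
# Hypothetical analogue of "sequential vs OpenMP vector element sum":
# a plain Python loop versus a vectorized NumPy reduction.
import time
import numpy as np

x = np.random.rand(1_000_000)

t0 = time.perf_counter()
seq = 0.0
for v in x:                 # sequential, element-by-element accumulation
    seq += v
t_seq = time.perf_counter() - t0

t0 = time.perf_counter()
vec = float(np.sum(x))      # vectorized (pairwise) reduction
t_vec = time.perf_counter() - t0
```

Both reductions agree up to floating-point accumulation order; the vectorized form is the analogue of the parallel mode in the notes.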
Cisco Smart Intersections: IoT insights using wifiCarl Jackson
In this trial an Edge hosted Wi-Fi solution was evaluated for the purpose of extracting insights into road user behaviour and performance at the intersection within the AIMES testbed in Melbourne, in partnership with University of Melbourne, Department of Transport (DOT), Cohda Wireless, IAG and Cisco.
Opendatabay - Open Data Marketplace.pptxOpendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay AI-driven features streamline the data workflow. Finding the data you need shouldn't be a complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with a dedicated, AI-generated, synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Analysis of Road Accidents in United Kingdom
Tushar Shailesh Dalvi
x18134301
April 12, 2019
Abstract
Road traffic safety is one of the main concerns for the Department for Transport and for the citizens of any country. To provide maximum safety for everyone, government bodies, local agencies, and the Department for Transport continually evaluate current transportation strategies. To reduce accidents and avoid casualties, road traffic data should be analysed carefully, which can lead us to a safe driving environment in which casualties are rigorously reduced. This paper provides a simple but useful analysis of UK road accident data which can help reduce the number of road accidents. In this paper I study how different road conditions, road surfaces, and potholes affect driving and lead to accidents, and how much other factors such as weather conditions and light conditions affect driver visibility. These factors help us analyse and understand the real cause of an accident from a first-person perspective. The analysis also gives an overview of the relationship between vehicle collision approach, vehicle condition, type of vehicle, driver age group, and sex of the driver. To enhance this report, we also examine where most accidents happened and whether a police officer or local authority attended the accident spot.
1 Introduction
Road traffic crashes are one of the world's largest public health and injury prevention problems. The problem is all the more acute because the victims are overwhelmingly healthy before their crashes, and after a crash they are left with physical or mental injuries. A report published by the WHO in 2004 estimated that some 1.2 million people were killed and 50 million injured in traffic collisions on the roads around the world each year, and that road crashes were the leading cause of death among children 10 to 19 years of age. The report also noted that the problem was most severe in developing countries and that simple prevention measures could halve the number of deaths Road traffic safety (2019).
The problem of deaths and injury as a result of road accidents is now acknowledged to be a global phenomenon, with authorities in virtually all countries of the world concerned about the growth in the number of people killed and seriously injured on their roads. In recent years there have been two major studies of causes of death worldwide, published in the Global Burden of Disease (1996, World Health Organisation, World Bank and Harvard University) and in the World Health Report 'Making a Difference' (WHO 1999). These publications show that in 1990 road accidents as a cause of death or disability were by no means insignificant, lying in ninth place out of a total of over 100 separately identified causes. However, forecasts suggest that by the year 2020 road accidents will move up to sixth place as a cause of death, and in terms of years of life lost (YLL) and disability adjusted life years (DALYs) will be in second and third place respectively Jacobs & Aeron-Thomas (2000).
With reference to these reports and findings, I would like to contribute some findings of my own that could help reduce road accidents and shed some light on the relationship between road accidents and different factors. With the help of data warehousing techniques, I am going to develop a system which can fulfil the requirements below:
(Req-1) Is there any relation between road death count and population?
(Req-2) What is the relation between road accident deaths and car and bike registrations?
(Req-3) Where does the United Kingdom rank in road accident fatality count among other European countries?
2 Data Sources
To complete this project, I have used 3 data sets from 3 different sources: the 1st is a structured data set from Kaggle, the 2nd is a structured data set from Statista, and the 3rd is unstructured data scraped from a PDF and Wikipedia. All data sources are explained below.

Source     Type          Brief Summary
Kaggle     Structured    Detailed accident records for the years 2011 to 2017, with number of casualties, speed limit, and other related information.
Statista   Structured    Number of bike and car sales from 2011 to 2017.
PDF        Unstructured  Road death numbers for all European Union countries for the year 2017.
Wikipedia  Unstructured  United Kingdom population from 2011 to 2017.

Table 2: Summary of sources of data used in the project
2.1 Source 1: Kaggle (Structured)
My 1st data set is from Kaggle, and it contains information about United Kingdom road accidents and the vehicles involved in accidents from 2005 to 2017. For the project I downloaded one data set, Road Safety Data - Accidents 2017. The data set creation date on the website is Tuesday, 14 August 2018 11:03:43 GMT+0100 (British Summer Time), and it was last updated on Tuesday, 12 January 2019 12:23:24 GMT+0000 (Greenwich Mean Time), which fulfils the project requirement that structured data should have been created within 1 year. The data set was a ZIP archive containing the files Accident Information.csv and Vehicle Information.csv. I used the Accident Information.csv file to explore the data further and draw proper conclusions. This main file holds accident information, in which Accident Index, Accident Severity, Number of Casualties, and Year are crucial fields; apart from these, the other fields I used to get proper insight into the United Kingdom road accident data are listed below.
Figure 1: Source 1
Kaggle Source: https://www.kaggle.com/tsiaras/uk-road-safety-accidents-and-vehicles/activity
2.2 Source 2: Statista
The second data set was downloaded from the Statista website. It contains the total number of motorcycles and cars registered in the United Kingdom from 2000 to 2017. The Statista file had two Excel sheets: the first was an Overview sheet mentioning all the details regarding the data and its source, and the second sheet consisted of the actual yearly registration data. We removed the Overview sheet using R code (Appendix). The data was published in April 2018 on Statista. This data incorporates three columns: Year, Cars, Motorcycles.
Figure 2: Source 2
Statista Source: https://www.statista.com/statistics/312594/motorcycle-and-car-regist
2.3 Source 3: Wikipedia and Road Safety Authority website (Unstructured)
My third data set came from two sources, www.wikipedia.org and www.rsa.ie, in which the data is provided as a web page and as a PDF. From Wikipedia we extracted the population of the United Kingdom for the years 2011 to 2017. The data extracted from Wikipedia was not in a proper format; to scrape it we used the htmltab library. In the first step, after scraping the data, we removed some unwanted rows, because this Wikipedia page maintains data from 1938 onwards and for this analysis we only needed the years 2011 to 2017; we also removed some extra columns and, to identify the country, added an extra Country field. After that we renamed the columns so that it is clear what data each column holds. In the second stage we removed the blank rows in the population columns; to complete this task we used the is.na function, which was best fitted for this condition. Some of the population values also had blank spaces in between the digits, because of which we could not get a proper integer format in SSMS, so we removed those spaces with the gsub function provided by R.
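The population clean-up described above can be sketched as follows. This is a minimal sketch, assuming the table position on the Wikipedia page and the column names Year and Population; the live page layout may differ.

```r
# Sketch of the Wikipedia population cleaning step (table index and
# column names are assumptions; the live page may differ).
library(htmltab)

url <- "https://en.wikipedia.org/wiki/Demography_of_the_United_Kingdom"
pop <- htmltab(doc = url, which = 2)            # assumed table position

pop <- pop[pop$Year %in% 2011:2017, ]           # drop rows outside 2011-2017
pop$Country <- "United Kingdom"                 # extra Country field
pop <- pop[!is.na(pop$Population), ]            # remove blank population rows

# Strip the spaces inside numbers (e.g. "63 285 000") so SSMS can
# read the column as an integer
pop$Population <- as.integer(gsub("[^0-9]", "", pop$Population))

write.csv(pop, "uk_population_2011_2017.csv", row.names = FALSE)
```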
The second part of the data was downloaded in PDF format from www.rsa.ie. This website is maintained by the RSA, the Road Safety Authority of Ireland, and is an open website: the PDF we downloaded is accessible to anyone. That PDF maintains various data related to accidents in all European countries. To scrape data from the PDF we used libraries such as tabulizer, rJava, and tidyr. Those libraries help to get proper data from a PDF: initially we set which data we needed to extract, and after scraping it we needed to clean it for proper use. We deleted some columns to keep only the data we could use, and then renamed the columns to give proper insight into what each holds. Some of the country names contained junk values; to remove them we used the separate function. To identify the year, we added a Year field. From the table we also removed the population and other unwanted fields. The final and crucial step was to delete the row containing UK accident death data, because we get that figure from the structured data set, so I deleted those rows. Lastly, the whole cleaned data set was stored in a CSV file.
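The PDF scraping steps above can be sketched like this; the file name, page number and column layout of the RSA report are assumptions:

```r
# Sketch of the RSA PDF extraction (file name, page and columns assumed).
library(tabulizer)   # needs rJava
library(tidyr)

tables <- extract_tables("ireland_rsa_report.pdf", pages = 3)
deaths <- as.data.frame(tables[[1]], stringsAsFactors = FALSE)
names(deaths) <- c("Country", "RoadDeath")      # assumed column order

# Split off junk glued to the country names, e.g. "Ireland*" -> "Ireland"
deaths <- separate(deaths, Country, into = "Country", sep = "\\*",
                   extra = "drop")

deaths$Year <- 2017                             # added Year field
# UK deaths come from the structured data set, so drop that row here
deaths <- deaths[deaths$Country != "United Kingdom", ]

write.csv(deaths, "eu_road_deaths_2017.csv", row.names = FALSE)
```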
Wikipedia Source: https://en.wikipedia.org/wiki/Demography_of_the_United_Kingdom
PDF Source: http://www.rsa.ie/en/Utility/News/2018/Ireland-4th-Safest-EU-Country-for-
3 Related Work
Several analyses have been conducted based on the data and reports of road accidents in the United Kingdom, and based on those analyses actions have already been taken, with authorities continuing to act to prevent road accidents. As discussed in the abstract, road traffic safety is one of the main concerns for the Department for Transport and for the citizens of any country. As published in 1998, road traffic injuries were estimated to be the 9th leading cause of loss of healthy life globally and are projected to become the 3rd leading cause by 2020; the majority of this burden is located in the developing world, where most of the projected increase will occur Ghaffar et al. (2002).
4 Data Model
A data model refers to the logical inter-relationships and data flow between the different data elements involved in the information world. It also documents the way data is stored and retrieved. Data models facilitate communication between business and technical development by accurately representing the requirements of the information system and by designing the responses needed for those requirements. Data models help represent what data is required and what format is to be used for different business processes What is a Data Model? - Definition from Techopedia (n.d.). The reason behind Kimball's approach is that it is easier to extend the data warehouse, as it can easily accommodate new business units: it is just a matter of creating new data marts and integrating them with the other data marts (n.d.a). Moreover, using Kimball's approach, data storage is not an issue, as the space required is less, which makes the data warehouse process queries much faster compared to Inmon's approach Yessad & Labiod (2016).
To achieve the desired insight from my data, I connect the data sets with each other based on the unique factors they share. My primary data set 2.1, which contains all detailed information about road accidents, is joined with my secondary Statista data set 2.2 based on year and country. My primary data set 2.1 is also connected to the unstructured data 2.3 on the same factors, to link the death count with countries and years. Based on these data sets I have created 3 dimensions, DimYear, DimCountry and DimRoadDeath, which are explained below:
DimYear: This dimension is created to hold the year data present in all data sources. My structured data set contains information about all accident incidents by year; hence, to identify the number of casualties, year will be my main factor. Similarly, my Statista data set 2.2 contains car and bike registration information year-wise. Hence Year plays one of the main roles in my data warehouse. The attributes used in this dimension are YearID and Year, in which YearID is generated using SSIS and used as the primary key of the Year dimension.
DimCountry: In my structured and Statista data sources the only country in common is the United Kingdom; unlike these, my unstructured data source 2.3 contains all European countries, which I want to combine with the structured and Statista data. Hence I created the Country dimension, in which I added all the countries from all my sources. In DimCountry I used CountryID, CountryCode and Country; CountryID is used as the primary key and is generated using SSIS.
DimRoadDeath: This dimension consists of all the road accident death counts that I require. While creating this dimension I deleted the United Kingdom's road death count from the unstructured data set 2.3; instead, I used my structured data set 2.1 to get the road death count for each year from 2011 to 2017. This dimension consists of the RoadDeathID, CountryCode and RoadDeath attributes; RoadDeathID is the primary key for this table and is generated by SSIS.
FactTable: The fact table which I made contains all the essential keys of the measurements. The fact table plays an essential role in setting up my business needs because it contains all the aggregated values that I need to answer my BI queries. The attributes used in the fact table are listed below:
YearID: Primary key of DimYear, used as a foreign key in the fact table.
CountryID: Primary key of DimCountry, used as a foreign key in the fact table.
RoadDeathID: Primary key of DimRoadDeath, used as a foreign key in the fact table.
Casualties: Contains all the death counts for all countries.
Population: Holds the population count for each year.
CarRegistration: Contains the car registration count for the years 2011 to 2017.
BikeRegistration: Holds the bike registration count for the years 2011 to 2017.
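A minimal T-SQL sketch of this structure is given below; the column types are assumptions, and only the names listed above are taken from the design:

```sql
-- Hypothetical DDL for the fact table; exact types may differ in the
-- actual project.
CREATE TABLE FactTable (
    YearID           INT NOT NULL REFERENCES DimYear(YearID),
    CountryID        INT NOT NULL REFERENCES DimCountry(CountryID),
    RoadDeathID      INT NOT NULL REFERENCES DimRoadDeath(RoadDeathID),
    Casualties       INT,
    Population       BIGINT,
    CarRegistration  INT,
    BikeRegistration INT
);
```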
Figure 3: Star Schema
5 Logical Data Map
The logical data map below reflects the desired star schema. In this data map I have explained all dimensions and the fact table that I used, and how each column was transformed before loading.

Table 3: Logical data map describing all transformations, sources and destinations for all components of the data model illustrated in Figure 3

Source  Column                   Destination   Column            Type       Transformation
1,2,3   Year                     DimYear       Year              Dimension  Converted to integer from character format; collected distinct years from all 4 data sources.
1,2,3   CountryCode              DimCountry    CountryCode       Dimension  Collected all distinct country codes from each source and used them in the DimCountry dimension.
1,2,3   Country                  DimCountry    Country           Dimension  Collected all distinct countries, with reference to the country code, from the unstructured and structured sources and used them in the DimCountry dimension.
1,3     Road Death for Year 2017 DimRoadDeath  RoadDeath         Dimension  Collected all road casualties from the PDF except the United Kingdom; death casualties for the United Kingdom were collected from the structured data source.
3       ISOCode                  DimRoadDeath  CountryCode       Dimension  Used to link the casualties of specific countries to the respective countries.
3       Population               FactTable     Population        Fact       Population collected from Wikipedia for the United Kingdom for the years 2011 to 2017.
2       CarSales                 FactTable     CarRegistration   Fact       Collected to retrieve the car registration count for the years 2011 to 2017.
2       BikeSales                FactTable     BikeRegistration  Fact       Collected to retrieve the bike registration count for the years 2011 to 2017.
6 ETL Process
The process of extracting data from multiple source systems, transforming it to suit business needs, and loading it into a destination database is commonly called ETL, which stands for extraction, transformation, and loading (n.d.b). The ETL process is the core process of any data warehouse system. To construct a performant data warehouse, the data is first extracted from various sources; next it must be transformed and cleaned as per the requirements, keeping in mind the removal of all kinds of duplicate data; and then it is loaded into the data warehouse. To complete my project I used the ETL process below.
Figure 4: Automated ETL Process
6.1 Extraction
First, the structured data was downloaded from Kaggle; there were 34 columns in total, including Accident Index, Number of Casualties, Speed Limit, and many other fields. This is my primary data set, consisting of overall information about accidents that occurred in the United Kingdom from 2005 to 2017. To complete this project I mainly need data from 2011 to 2017, so I removed the excess rows I did not want using the R programming language. My second data set was from Statista; Statista data usually comes in a properly cleaned format but carries multiple sheets in an Excel file. To remove and alter fields from the Statista data, the first thing I needed to do was read that Excel file in R, so I used the readxl library to read the file, providing the name of the sheet I wanted to read. The second sheet contained the years 2011 to 2017 with the total number of motorcycles and cars registered in the respective years. For my third data set I used data from Wikipedia and the Road Safety Authority annual report, which was publicly open for anyone to download. Finding the proper data on Wikipedia was an easy task, but scraping that specific data and altering it was not that flexible: the raw data contained lots of unwanted fields, such as fertility rate and natural change, which I removed in the transformation process. Like Wikipedia, my second source of unstructured data was a PDF holding lots of data about road accident counts all over the European Union; to find the proper table in the PDF, the extract_tables function from tabulizer really saved a lot of time.
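The Statista extraction step can be sketched as below; the file name is an assumption about the downloaded workbook, and the column names follow Section 2.2:

```r
# Sketch of reading the registration sheet from the Statista workbook
# (file name assumed; sheet 2 skips the Overview sheet).
library(readxl)

reg <- read_excel("statista_registrations.xlsx", sheet = 2)
reg <- reg[reg$Year %in% 2011:2017, c("Year", "Cars", "Motorcycles")]
write.csv(reg, "uk_registrations_2011_2017.csv", row.names = FALSE)
```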
6.2 Transformation
After data is extracted, it must be physically transported to the target destination and converted into the appropriate format. This data transformation may include operations such as cleaning, joining, and validating data or generating calculated data based on existing values Kimball & Ross (2011). Initially I identified all the important fields which I needed to keep and those which needed to be removed. I read my CSV file into R with the read.csv function and assigned it to a variable; after that, all changes were made to that variable only. To remove the excess rows belonging to 2005 to 2010 I needed to identify those rows; for that I used the grepl function on the year field, with an exclamation mark in front of grepl, so that I could remove the records containing the years 2005 to 2010. After that I needed a country field to identify the country, so I created a new field using the country code UK. Then I checked the whole data set for NULL values and blank spaces; the speed limit field had some blank values which were creating irregularities while uploading the data into SSMS, so I omitted all the blank and null values from the data set using the !is.na function.
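The filtering described above can be sketched as follows; the CSV column names (Year, Speed_limit) are assumptions about the Kaggle file:

```r
# Sketch of the Kaggle accident-file clean-up (column names assumed).
acc <- read.csv("Accident_Information.csv", stringsAsFactors = FALSE)

# Inverted grepl(): drop every record whose year is 2005..2010
acc <- acc[!grepl("200[5-9]|2010", acc$Year), ]

acc$CountryCode <- "UK"                         # new country field

# Remove blank / NULL speed limits that broke the SSMS upload
acc <- acc[!is.na(acc$Speed_limit) & acc$Speed_limit != "", ]

write.csv(acc, "accidents_2011_2017.csv", row.names = FALSE)
```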
My second data set was from Statista; Statista data usually comes in a properly cleaned format but carries multiple sheets in an Excel file. To remove and alter fields from the Statista data, I first needed to read the Excel file in R, so I used the readxl library, providing the name of the sheet I wanted to read, and inserted all those rows and columns into a variable. After that I removed all unwanted rows in R so that I kept only the content I needed for the data warehousing project. To identify the data with its respective country I added a country field, and finally wrote the data to a CSV file.
For my third data set I used data from Wikipedia and the Road Safety Authority annual report, which was publicly open for anyone to download. In the transformation process I used the htmltab library, which helped me scrape the proper data from Wikipedia. For the cleaning process I again used the !is.na function. In this data set some numeric values had spaces in between; to find those spaces I used the gsub function, which proved very handy and easy to use. To write all the cleaned data to CSV I used the write.csv() function for proper output. Last but not least, data had to be scraped from the PDF, which is publicly available on the Road Safety Authority website maintained by the Ireland RSA team. Finding the proper information was a crucial task, and for that I used the tabulizer, rJava and tidyr libraries, which reduced my work. I used the extract_tables function from tabulizer to get the proper data from a specific page; to get the exact information I even used data coordinates to extract data from the PDF. The data I got was raw, so cleaning it was the main task for me. Some of the columns had junk values in them, so the separate function from the tidyr library really saved a lot of time. This data contained the United Kingdom death count, which was already available in my structured data, so we removed that row and took the figure from the structured data set instead.
A. Libraries I used in the cleaning process:
1. readxl
2. tabulizer
3. rJava
4. tidyr
5. htmltab
B. Functions I used in the cleaning process:
1. !grepl
2. !is.na
3. extract_tables
4. data.frame
5. read_excel
6. read.csv
7. write.csv
8. separate
9. setdiff
6.3 Load
The final step in the ETL process involves loading the transformed data into the destination target.
To load the cleaned and transformed data into a database through SSMS (SQL Server Management Studio), I needed to create a new database on which I could build the data warehouse; the database I created in SSMS for my project is named UK. After creating the database, we need to load all the data into the staging area in SSMS through SSIS (SQL Server Integration Services). To complete this process I used the Data Flow Task, which carries out the data flow. Under the Data Flow Task I used the Flat File Source component, which loads data from a CSV file, and carried that data to SSMS using the OLE DB Destination component. The OLE DB component links the data into SSMS, helps to create a table in SSMS, and loads the data into that table.
In the Flat File Source component we need to set up a flat file connection manager, in which we set a connection manager name and then select the file location on the system. Once the file is set, we need to change the text qualifier from NONE to quotation marks, depending on which separator is used in the flat source file. In the left pane, under the Columns section, you are able to preview 100 rows of your flat source file. Below Columns, under the Advanced section, SSIS provides the ability to change the data types of the various fields via the Data Type option; similarly, OutputColumnWidth helps to set the character width for specific columns. Once done with these settings, we can move on to some settings on the OLE DB Destination. Inside the OLE DB Destination, under the connection manager, we can set the database into which we want to insert the values. We can write a SQL query to create a new table in SSMS, or SSIS provides an accurate suggestion for a create-table statement based on the flat file source. To create a new table we need to click on the New button; once the create table query appears, we can change the table name as we want. I used the flat file names so I would not get confused while creating the dimension tables. In the left pane we can check the column mappings from the Flat File Source to the OLE DB Destination. Once we start the process, the table gets created inside SSMS. Under a single Data Flow Task we can run multiple data flows; for my project I had 4 flat files in total, so I used 4 data flow components, one for each file.
After the files are uploaded into SSMS, we need to create the dimension tables. Dimension tables provide the context for fact tables and hence for all the measurements presented in the data warehouse. Although dimension tables are usually much smaller than fact tables, they are the heart and soul of the data warehouse because they provide entry points to the data Kimball & Ross (2011). To create the dimensions I used SQL; with its help I created the dimensions, among them Country and Year. In SSIS I used the Execute SQL Task component to run SQL queries: in that component, under the construction section, I selected the SSMS instance and database name, and in the SQL statement field I wrote the queries through which I created the dimension tables, which in turn help me to create the fact table. Fact tables hold the measurements of an enterprise. The relationship between fact tables and measurements is extremely simple: if a measurement exists, it can be modeled as a fact table row Kimball & Ross (2011). To create the fact table I again used SQL. Creating the fact table was an easy task, because a fact table only contains facts, that is, numerical values, but populating it with proper data was quite a hard and interesting job: while populating the fact table, lots of duplicate data was appearing in the table, and getting the proper data into the fact table was an abstruse task. Using various types of joins helped to populate the proper data into the fact table.
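The join-based population step could look like the following sketch; a single consolidated staging table and its column names are assumptions made for brevity:

```sql
-- Hypothetical INSERT ... SELECT joining staging data to the
-- dimensions; table and column names are assumed.
INSERT INTO FactTable (YearID, CountryID, RoadDeathID,
                       Casualties, Population,
                       CarRegistration, BikeRegistration)
SELECT dy.YearID,
       dc.CountryID,
       dr.RoadDeathID,
       s.Casualties,
       s.Population,
       s.CarSales,
       s.BikeSales
FROM Staging AS s
JOIN DimYear      AS dy ON dy.Year = s.Year
JOIN DimCountry   AS dc ON dc.Country = s.Country
JOIN DimRoadDeath AS dr ON dr.CountryCode = dc.CountryCode;
```

Depending on the join keys, SELECT DISTINCT may also be needed to suppress the duplicate rows mentioned above.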
After creating the fact table it is important that it interacts with the dimension tables, so I created constraints to connect the fact table with the dimension tables. After that, cube deployment was the main task remaining, because only after a successful cube deployment can we see the desired star schema. To complete this process I selected a new cube from the cube wizard, selected my existing fact table and dimension tables, and under the fact table selected the measures required for the BI queries. Then I named my cube UK and completed the wizard. As soon as the wizard completed, the star schema appeared with all the chosen measures. To check whether the fact table was populated, I used the Explore Data option, which displayed all the measures with the values required to perform the BI queries. I verified that all the values were showing properly, and after all those processes I can say that my cube was successfully deployed.
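The constraints mentioned above, which connect the fact table to the dimension tables so that the cube wizard can detect the star schema, could be declared roughly as follows (a sketch; the table and column names are illustrative, as the actual schema is not shown in the report):

```sql
-- Foreign keys linking the fact table to its dimensions (illustrative names)
ALTER TABLE FactCasualty
    ADD CONSTRAINT FK_FactCasualty_DimCountry
    FOREIGN KEY (CountryKey) REFERENCES DimCountry (CountryKey);

ALTER TABLE FactCasualty
    ADD CONSTRAINT FK_FactCasualty_DimYear
    FOREIGN KEY (YearKey) REFERENCES DimYear (YearKey);
```

SSAS reads these foreign-key relationships from the data source view to infer the fact-to-dimension links of the star schema.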
7 Application
In the section above we saw a successful cube deployment, which we will be using to answer our business questions in this section. The following are the essential business queries which I believe will be very helpful to assess the requirements discussed in Section 3. The results of these queries, with respect to past related work on the subject referenced in Section 3, are examined in detail in Section 7.4. Now, let us evaluate the three business queries and their outcomes to address and demonstrate the attainment of the business requirements.
7.1 BI Query 1: Does the population of the United Kingdom have any effect on casualties throughout the years 2011 to 2017?
For this query the contributing sources of data are data sources 2.1 and 2.3. Here we can see that the population is expanding steadily from 2011 towards 2017, but there is no such pattern in the casualty graph. In 2011 road casualties were at their highest point with a count of 1797, then started falling, and in 2013 the casualty count was at its lowest point with 1608. As per our data, the casualty count for 2017 is 1676, which is lower than 2011 but higher than 2013. So from the graph we can say that the population count does not have any relation with the number of road casualties in any year (Figure 5).
Figure 5: Result of BI Query 1
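A relational equivalent of BI Query 1 could look like the following. This is a sketch against the hypothetical star schema (the cube is actually browsed through SSAS, and the table and column names here are illustrative):

```sql
-- Casualties and population per year, to compare the two trends side by side
SELECT y.[Year],
       SUM(f.CasualtyCount) AS TotalCasualties,
       MAX(f.Population)    AS Population
FROM FactCasualty f
JOIN DimYear    y ON y.YearKey    = f.YearKey
JOIN DimCountry c ON c.CountryKey = f.CountryKey
WHERE c.CountryName = 'UK'
  AND y.[Year] BETWEEN 2011 AND 2017
GROUP BY y.[Year]
ORDER BY y.[Year];
```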
7.2 BI Query 2: Does the number of casualties have any effect on new bike and car registrations in the United Kingdom?
For this query the contributing sources of data are sources 2.1 and 2.2. As the graph illustrates, the casualties do not follow any regular pattern: the topmost death count in the graph is 1797, in 2011, while the car and bike registration counts were 28467 and 1238 respectively for the same year. Even though the casualty graph goes up and down, the car and bike registration rate increases steadily every year. So we can conclude that the United Kingdom population is more inclined to buy new vehicles, disregarding road accidents (Figure 6).
Figure 6: Result of BI Query 2
7.3 BI Query 3: At what rank does the United Kingdom stand among the other European Union countries in the count of casualties for the year 2017?
For this query the contributing sources of data are sources 2.1 and 2.3. Here we can see from the world map that the United Kingdom is in 7th place in the European Union road casualty count, whereas France stands in 1st place with a casualty count of 3448 and Malta has the fewest casualties with 19 (Figure 7).
Figure 7: Result of BI Query 3
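The ranking in BI Query 3 could be reproduced relationally with a window function, sketched here against the same hypothetical star schema (illustrative names):

```sql
-- Rank EU countries by 2017 road casualties, highest first
SELECT c.CountryName,
       SUM(f.CasualtyCount) AS Casualties,
       RANK() OVER (ORDER BY SUM(f.CasualtyCount) DESC) AS CasualtyRank
FROM FactCasualty f
JOIN DimCountry c ON c.CountryKey = f.CountryKey
JOIN DimYear    y ON y.YearKey    = f.YearKey
WHERE y.[Year] = 2017
GROUP BY c.CountryName
ORDER BY CasualtyRank;
```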
7.4 Discussion
Now that we are prepared with the results, it is time to discuss the observations. Let us discuss BI Query 1 [7.1] in detail: this query provides us with the statistics for the death casualties. Even though the population is increasing rapidly, road casualties still go up and down; to reduce the casualty count we need to improve road user behaviour. Improving road user behaviour is fundamental to reducing road traffic injuries and fatalities. It is one of five key pillars of the Global Plan for the Decade of Action for Road Safety 2011-2020 (alongside better road safety management, safer road networks, safer vehicles and improved post-crash response) Organization et al. (2016). Road user behaviour can be improved by road safety campaigns, which in combination with social measures (e.g., law enforcement, education or training) can become a powerful method to induce the general public to behave more safely in traffic. The Global Plan for the Decade of Action is grounded in the Safe System approach, which addresses risk factors influencing road users, vehicles and the road environment in an integrated manner, enabling more effective prevention. This approach is known to be appropriate and effective in settings around the world.
The second query provides us with factors which indicate that people intend to buy more vehicles for transportation. As per the report, over the past few decades regulation and consumer demand have led to increasingly safe cars in high-income countries/areas. Many of the features that began as relatively expensive safety add-ons in high-end vehicles have since become much more affordable and are now considered basic requirements for all vehicles in some countries/areas. Rapid motorization in low- and middle-income countries/areas, where the risk of a road traffic crash is highest and where motor vehicle production is increasing in tandem with economic growth, means there is an urgent need for these basic requirements to be implemented globally Organization et al. (2017). To reduce deaths caused by vehicles it is important to ensure that vehicle designs stick to recognized safety standards; in the absence of such standards, automobile companies can sell obsolete designs that are no longer legal in well-regulated countries. Alternatively, automobile companies frequently de-specify life-saving technologies in newer models sold in countries where regulations are weak or neglected.
8 Conclusion and Future Work
The results of our observations help us to understand the status of road death casualties in the United Kingdom from 2011 to 2017. As per the graphs, the count of new car and bike registrations does not depend on road accidents, nor do road deaths depend on a country's population. Even though we were able to reduce some accidents with previous studies, were we able to improve our road safety systems? And are we really shaping our systems towards accident-free cities? I tried to build a system in which I combine data from various factors and correlate them for a better outcome. What I observed from the graphs is that road accidents damage people at different levels: slight, serious and fatal. In this data warehouse we covered only the counts with fatal accident severity; with the addition of more detailed data we could create more dimensions and gain deeper insight into the reasons for accidents. We could check the driver's perspective, road or weather conditions, or perhaps vehicle condition, which could really help us to prevent or reduce the total number of casualties. Because of this limitation in the database we were not able to find the proper causes and conditions of the accidents. I need to consider this in future work so that we can build a more robust system, in which we can relate and identify more causes of road accidents: locations, weather or road conditions, vehicle status, the driver's perspective, or perhaps age, gender, as well as economic loss. These points could lead us to reduce the road accident count.
References
Ghaffar, A., Hyder, A. A., Bishai, D. & Morrow, R. H. (2002), ‘Intervention for con-
trol of road traffic injuries: review of effectiveness literature’, JOURNAL-PAKISTAN
MEDICAL ASSOCIATION 52(2), 69–72.
Jacobs, G. & Aeron-Thomas, A. (2000), 'A review of global road accident fatalities', Paper commissioned by the Department for International Development (United Kingdom) for the Global Road Safety Partnership.
Kimball, R. & Ross, M. (2011), The data warehouse toolkit: the complete guide to dimen-
sional modeling, John Wiley & Sons.
Organization, W. H. et al. (2016), ‘Road safety mass media campaigns: A toolkit’.
Organization, W. H. et al. (2017), ‘Save lives: a road safety technical package’.
Road traffic safety (2019).
URL: https://en.wikipedia.org/wiki/Road_traffic_safety
What is a Data Model? - Definition from Techopedia (n.d.).
URL: https://www.techopedia.com/definition/18702/data-model
Yessad, L. & Labiod, A. (2016), Comparative study of data warehouses modeling approaches: Inmon, Kimball and Data Vault, in '2016 International Conference on System Reliability and Science (ICSRS)', IEEE, pp. 95–99.
Appendix
R code example
# Importing installed library
library(readxl)
# Setting working directory
setwd("E:/Final.Project/Statista")
# Fetching data from Excel into Sales
Sales <- data.frame(read_excel("UK.Car.and.Bike.Sales.xlsx", sheet = "Data"))
# Deleting first rows
Sales = Sales[-1:-2, ]
# Changing column names
colnames(Sales) <- c("Year", "Car.Sales", "Bike.Sales")
# Deleting unwanted rows
Sales = Sales[-(1:11), ]
# Adding new field
Sales$Country <- 'UK'
# Writing to CSV
write.csv(Sales, "Statisa_Sales.csv", row.names = F)
# Importing installed libraries
library(tabulizer)
library(rJava)
library(tidyr)
# Set working directory
setwd("E:/Final.Project/Unstructure")
# Load 1st table from PDF
ratio <- extract_tables("PIN_ANNUAL_REPORT_2018_final.pdf", pages = 31, output = "data.frame")
# Load 2nd table from PDF
country <- extract_tables("PIN_ANNUAL_REPORT_2018_final.pdf", pages = 28, output = "data.frame")
ratio <- ratio[[1]]
country <- country[[1]]
# Delete columns
ratio <- ratio[, -c(5:8)]
# Assign names to columns
names(ratio) <- c("Country", "Road_Death_for_Year_2017", "Inhabitants", "Death.per.Million.Inhabitants")
# Delete rows
ratio = ratio[-(1:5), ]
# Separate junk values from row
ratio <- separate(ratio, "Country", into = c("CountryCode", "star"))
# Delete junk values column
ratio <- ratio[, -2]
# Add extra Year field
ratio$Year <- '2017'
# ratio$Country <- country$Country[match(ratio$Country, country$ISO.Code)]
ratio <- ratio[!(ratio$Country == "United.Kingdom"), ]
ratio <- ratio[setdiff(colnames(ratio), c('Inhabitants', 'Death.per.Million.Inhabitants'))]
# Delete rows with UK name
ratio <- ratio[!grepl("UK", ratio$CountryCode), ]
# Write all values to CSV
write.csv(country, "CountryName.csv", row.names = F)
write.csv(ratio, "2018_Annual_Report.csv", row.names = F)
########################_ScrapingFromWikipedia_###############################
# Set working directory
setwd("E:/Final.Project/Unstructure")
library(htmltab)
url <- "https://en.wikipedia.org/wiki/Demography_of_the_United_Kingdom"
Population <- htmltab(doc = url, which = 37)
Population <- Population[, -c(3:9)]
Population$Country <- 'UK'
names(Population) <- c("Year", "Population", "Country")
Population <- Population[!is.na(Population$Population), ]
Population = Population[-(1:40), ]
# Population$Population <- trim(Population$Population)
# Remove literal dots; fixed = TRUE is needed so "." is not treated as a regex
Population$Population <- gsub(".", "", Population$Population, fixed = TRUE)
write.csv(Population, "population.csv", row.names = F)