This document describes a project to aggregate data from various sources about events and traffic conditions, and visualize that data to help explain abnormal traffic patterns. Data is collected from APIs providing information about scheduled events, weather, traffic incidents, and real-time traffic flow. The data is stored in a database and can be visualized on a map interface, allowing users to search for events within a given location and time range. The goal is to help analyze reasons for congestion and support future traffic prediction and analysis.
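The location and time-range search described above can be sketched as a simple filter. This is a minimal illustration, not the project's implementation: the `Event` fields, the `search_events` function, and the sample records are all hypothetical, and a real system would more likely push this query into the database.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    # Hypothetical record shape; the source does not specify a schema.
    name: str
    lat: float
    lon: float
    start: datetime
    end: datetime


def search_events(events, south, west, north, east, t_from, t_to):
    """Return events inside the bounding box whose time span overlaps [t_from, t_to]."""
    return [
        e for e in events
        if south <= e.lat <= north
        and west <= e.lon <= east
        and e.start <= t_to
        and e.end >= t_from
    ]


# Made-up sample data for illustration only.
events = [
    Event("concert", 40.75, -73.99, datetime(2024, 5, 1, 19), datetime(2024, 5, 1, 23)),
    Event("marathon", 40.78, -73.97, datetime(2024, 5, 2, 8), datetime(2024, 5, 2, 14)),
    Event("parade", 41.90, -87.65, datetime(2024, 5, 1, 18), datetime(2024, 5, 1, 20)),
]

hits = search_events(
    events,
    south=40.70, west=-74.05, north=40.80, east=-73.90,
    t_from=datetime(2024, 5, 1, 17), t_to=datetime(2024, 5, 1, 21),
)
print([e.name for e in hits])
```

The overlap test keeps any event that is in progress at some point during the requested window, which is usually what matters when explaining congestion.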
Eleven police detachments provide service to over 160,000 residents in the Columbia Basin region of British Columbia, which covers the West and East Kootenay areas. According to spatial datasets from DataBC, the Central Kootenay Regional District saw the largest increase in policing service population from 2004 to 2014, while the East Kootenay and Columbia Shuswap regional districts saw more modest increases. The data is freely available online from DataBC under an open license but some datasets are updated irregularly and did not align easily, requiring additional processing to spatially display population information.
Classification Approach for Big Data Driven Traffic Flow Prediction using Ap... (IRJET Journal)
This document discusses a proposed system for predicting traffic flow using big data and classification approaches. The system uses K-Nearest Neighbors (KNN) classification to identify traffic patterns and routes. It then uses a Convolutional Neural Network (CNN) to predict traffic flow levels on particular routes. The KNN identifies travel times between locations while the CNN predicts flow levels. The proposed system is evaluated using metrics like root mean squared error and mean relative error, and is found to improve accuracy and reduce prediction time compared to existing methods. The system aims to provide route recommendations to users based on minimum predicted traffic flow.
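The two evaluation metrics named above can be computed as follows. This is a sketch using the usual textbook definitions; the summary does not state the exact formulas the authors used, and the sample flow values are made up.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mean_relative_error(actual, predicted):
    """Mean of |actual - predicted| / |actual|; assumes no actual value is zero."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical traffic flow values (vehicles per hour).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(rmse(actual, predicted))
print(mean_relative_error(actual, predicted))
```

RMSE penalizes large misses more heavily, while the relative error puts low-flow and high-flow routes on a comparable scale.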
A Study on an Urban Safety Index Model Using Public Big Data and SNS Data (영원 서)
The document presents a study on developing an urban safety index model using public data and social media data. The study aims to analyze various public data sets related to safety like weather, traffic, fires, and crimes to calculate a safety index for different regions. It also collects and classifies tweets related to disasters to include social data. The model is evaluated using different machine learning algorithms, with neural networks achieving the best accuracy. A web-based monitoring system is designed to display the safety index and real-time social data on a map for users. Future work includes improving the classification and collecting more granular public data through APIs.
Viterbi optimization for crime detection and identification (TELKOMNIKA JOURNAL)
In this paper, we introduce two types of hybridization. The first contribution hybridizes the Viterbi algorithm with Baum-Welch to predict crime locations. The second combines decision tree (DT) optimization with the Viterbi algorithm for criminal identification, using crime datasets from Iraq and India. This work builds on our previous work [1]. The main goal is to improve the model in both running time and accuracy. The results obtained show that both goals were achieved.
The document discusses implementing an Intelligent Transportation System (ITS) in Moscow using SAP HANA to help address the city's traffic problems. It describes how SAP HANA could collect and analyze data from different sources in real time to better manage traffic and improve quality of life. According to the document, this would reduce road accidents by 50% and lower infrastructure costs by 10%. The document also argues that SAP HANA compares favorably with other ITS systems and could support various transportation services. Potential risks of the project include incompatibility with existing systems and low levels of inter-departmental communication.
The document discusses integrating sensor and social data to understand city events. It describes collecting data from multiple sources, including sensors and social media. Statistical models are used to analyze the sensor data and identify anomalies, which are then correlated with events extracted from social media using spatial and temporal proximity. The approach is evaluated on traffic data from San Francisco, integrating data from traffic sensors and Twitter to extract and corroborate traffic events.
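Correlating sensor anomalies with social-media events by spatial and temporal proximity can be sketched as below. The record layout, the `match_anomalies` helper, and the 1 km / 30 minute thresholds are assumptions made for illustration; the summary does not give the actual thresholds or matching rule.

```python
import math
from datetime import datetime, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def match_anomalies(anomalies, events, max_km=1.0, max_gap=timedelta(minutes=30)):
    """Pair each anomaly with every event that is both nearby and close in time."""
    pairs = []
    for a in anomalies:
        for e in events:
            close = haversine_km(a["lat"], a["lon"], e["lat"], e["lon"]) <= max_km
            recent = abs(a["time"] - e["time"]) <= max_gap
            if close and recent:
                pairs.append((a["id"], e["id"]))
    return pairs


# Made-up records: one sensor anomaly and three candidate social-media events.
anomalies = [{"id": "a1", "lat": 37.7749, "lon": -122.4194, "time": datetime(2024, 5, 1, 17, 0)}]
events = [
    {"id": "e1", "lat": 37.7755, "lon": -122.4190, "time": datetime(2024, 5, 1, 17, 10)},
    {"id": "e2", "lat": 37.8044, "lon": -122.2712, "time": datetime(2024, 5, 1, 17, 5)},
    {"id": "e3", "lat": 37.7755, "lon": -122.4190, "time": datetime(2024, 5, 1, 20, 0)},
]

matches = match_anomalies(anomalies, events)
print(matches)
```

Here only `e1` is both near the anomaly and close to it in time; `e2` is too far away and `e3` is too late.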
IRJET- Projecting Climate Impacts on Transportation by Diagnosing and Exa... (IRJET Journal)
This document discusses a proposed system to project the impacts of climate change on transportation in metropolitan cities by integrating real-time traffic updates and weather forecasts. It aims to develop a web application using the Google Maps JavaScript API and OpenWeatherMap API to obtain live traffic and weather data for 4 major cities. The system would analyze how traffic levels change with respect to weather and demonstrate this relationship through a graph. It would also send weather reports to users by email. The goal is to help authorities better understand climate impacts on transportation systems and reduce issues like pollution, flooding and inadequate rainfall.
Online Bus Arrival Time Prediction Using Hybrid Neural Network and Kalman fil... (IJMER)
This document presents a hybrid method for predicting bus arrival times using neural networks and Kalman filters. The proposed method combines a neural network trained on historical bus location and travel time data to make initial predictions, and then uses a Kalman filter to continuously update the predictions based on real-time GPS measurements from buses. The neural network model uses seven input nodes and a double hidden layer structure. The Kalman filter equations are used to fuse the neural network predictions with current GPS observations to improve prediction accuracy over time. A case study on a real bus route in Egypt showed the hybrid method achieved satisfactory prediction accuracy.
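The fusion step can be illustrated with the standard scalar Kalman measurement update. This is a generic sketch, not the paper's formulation: the summary does not give the state model or noise values, so the function name, the variances, and the travel-time numbers below are assumptions.

```python
def kalman_update(pred, pred_var, obs, obs_var):
    """Fuse a prior estimate with a noisy observation (scalar Kalman measurement update)."""
    gain = pred_var / (pred_var + obs_var)  # Kalman gain: how much to trust the observation
    est = pred + gain * (obs - pred)        # corrected estimate
    est_var = (1 - gain) * pred_var         # uncertainty shrinks after the update
    return est, est_var


# Hypothetical numbers: the neural network predicts 300 s of travel time,
# while the latest GPS measurement implies 360 s; both have variance 100.
est, est_var = kalman_update(pred=300.0, pred_var=100.0, obs=360.0, obs_var=100.0)
print(est, est_var)
```

With equal variances the gain is 0.5, so the fused estimate lands halfway between the prediction and the observation; a noisier GPS reading would pull the estimate less.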
Project on nypd accident analysis using hadoop environment (Siddharth Chaudhary)
For this project, the NYC motor-vehicle-collisions dataset is processed in the Hadoop ecosystem using MapReduce, Pig scripts, and Hive queries for analysis and visualization.
The document describes the features of Trisul Network Analytics Traffic Analyzer solution. It provides network visibility, performance monitoring, and security capabilities to network operations, security operations, and architecture/planning teams. The solution monitors network traffic through routers and switches to analyze routing efficiency, application and user activity, bandwidth usage, latency, security threats, and more. It includes customizable dashboards and reports to gain insights for troubleshooting, capacity planning, traffic management, and IPv6 transition monitoring.
This document presents an approach for generating valuable traffic density data to simulate route planning for patrol cars. It involves extracting location data from GPS and tracking devices of patrol cars over time. This data is used to calculate route frequencies, which are then encoded with color to represent density on a map. The route density data is then correlated with crime hotspot information to propose a new route planning simulation for law enforcement. This aims to more efficiently dispatch patrol cars by considering both traffic patterns and crime trends.
Certain Analysis on Traffic Dataset based on Data Mining Algorithms (IRJET Journal)
The document analyzes a traffic accident dataset using data mining algorithms to identify patterns and relationships that can provide safe driving suggestions. It applies association rule mining, classification using naive Bayes, and k-means clustering. The analysis finds that human factors like being drunk or collision type have a stronger effect on accident fatality than environmental factors. Clustering identifies regions with higher or lower fatality rates. Integrating additional data could enable more testing and safety suggestions.
This document discusses using data mining techniques to predict flight delays. It begins with an introduction discussing the growing issue of flight delays costing billions of dollars. It then discusses previous work applying classification algorithms like KNN, decision trees, and neural networks to flight delay data. The document focuses on applying these techniques to datasets containing over 1 million flights from January 2017 and 2018 with 60 features related to delays. It analyzes the performance of KNN, C5.0 decision trees and neural networks at predicting arrival delays, finding decision trees to be most accurate at 85%.
1) NavInfo is a leading location big data company in China that collects real-time traffic and road conditions data from over 30 million vehicles and other sources daily.
2) NavInfo provides traffic data and analytic services to automakers, governments, and enterprises through products like real-time traffic maps, traffic predictions, and local hazard warnings.
3) NavInfo has built a location big data platform called MineData that integrates massive traffic and road data to provide tools for visualization, spatial analysis, and custom location services.
This document proposes a web service framework to visualize sensor data from multiple sources in real-time to help with disaster management decision making. The framework includes collecting raw sensor data, processing and storing it in a database, mapping the sensor locations on a map display like Google Maps, and regularly updating the visualizations. It demonstrates the framework by visualizing real-time Taipei bus location and speed data to indicate road conditions after a disaster occurs. The framework is intended to help coordinate response resources across organizations and provide situational awareness for decision makers.
procedure for crime prevention and environmental degradation.pptx (IsMaiRa2)
The CIRAS (Crime Information Reporting and Analysis System) is an information system used by the Philippine National Police (PNP) to digitize crime reporting and analysis. It allows PNP units to enter and update crime incident records, generate statistical reports, and visualize crime data through maps and graphs. The system provides crime data that can be used for decision making, policymaking, and improving PNP performance. It centralizes crime information, standardizes crime reporting, and aims to reduce crime through data-driven policing.
This document discusses a system for mining traffic data using GPS-enabled mobile phones in a mobile cloud infrastructure. The system has three main components: a client interface on mobile devices, a server process, and cloud storage. The client filters GPS data from mobile devices to identify motorized transportation modes. This data is sent to the server, which uses distance-based clustering to group devices on the same vehicle. The clustered data and historical data are stored in the cloud for traffic detection. This mobile cloud approach reduces burdens on mobile devices and servers while leveraging cloud resources.
This content describes the Call Detail Records (CDR) data format, the data acquisition method, visualization in Mobmap, and applications for disaster management.
Decentralized system to compute safest route - Report (Anushka Patil)
Designed and implemented a backend application that shows people how to avoid dangerous spots on city streets while walking from one place to another. The application provides paths that offer trade-offs between safety and distance. We developed an algorithm that gives a person walking through a city several options for getting from one place to another: the shortest path, the safest path, and a path that balances both factors.
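One way to express the safety and distance trade-off is Dijkstra's algorithm over a blended edge cost. This is a sketch under stated assumptions, not the project's algorithm: the `alpha` weighting, the `best_path` function, the per-edge risk scores, and the toy graph are all hypothetical.

```python
import heapq


def best_path(graph, source, target, alpha):
    """Dijkstra over cost = alpha * distance + (1 - alpha) * risk.

    alpha=1.0 gives the shortest path, alpha=0.0 the safest, values in between a balance.
    graph maps a node to a list of (neighbour, distance, risk) edges.
    """
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path
        if node in settled:
            continue
        settled.add(node)
        for nxt, dist, risk in graph.get(node, []):
            if nxt not in settled:
                step = alpha * dist + (1 - alpha) * risk
                heapq.heappush(queue, (cost + step, nxt, path + [nxt]))
    return None  # target unreachable


# Toy street graph with made-up distances and risk scores.
graph = {
    "A": [("B", 1.0, 9.0), ("C", 3.0, 1.0), ("E", 1.5, 2.0)],
    "B": [("D", 1.0, 9.0)],
    "C": [("D", 3.0, 1.0)],
    "E": [("D", 1.5, 2.0)],
}

shortest = best_path(graph, "A", "D", alpha=1.0)
safest = best_path(graph, "A", "D", alpha=0.0)
balanced = best_path(graph, "A", "D", alpha=0.5)
print(shortest, safest, balanced)
```

In this toy graph the route through B is shortest but riskiest, the route through C is safest but longest, and the route through E wins when both factors are weighted equally.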
Project url: https://github.com/anushkaaaa/-Decentralized-system-to-compute-safest-route
Smart City Surveillance Running on Vehicles (Ma'ayan Doron)
This document summarizes a smart city surveillance system that uses vehicles to collect and share data. The system was developed with support from Indiana University-Purdue University Indianapolis and funding from the National Science Foundation and Department of Defense. In its current state, the system allows a server to accept client connections and queries and save information to a MySQL database. The server can also respond to clients and dictate vehicle target locations. Future work aims to improve client-server communication and allow the server to set trajectories for data collection in requested areas.
Mr. Paul Chang's presentation at QITCOM 2011 (QITCOM)
QITCOM 2011
Presentation:
City Operations Centre for Managing City
Presenter:
Mr. Paul Chang - Business Development Executive for Emerging Markets, IBM
The document discusses a scenario where smoke from fires in Mexico causes high particulate matter (PM) concentrations over the eastern United States. It describes the goals of detecting and predicting the smoke plume and its impacts on air quality to help manage transportation safety. Key needs are real-time fire and smoke monitoring, PM and ozone forecasts, and analysis and delivery tools to provide information to managers and the public.
This document provides a comprehensive literature review and analysis of various traffic prediction techniques. It begins with an abstract that outlines the need for accurate traffic forecasting to address issues caused by increased road traffic. The document then reviews several existing traffic prediction methods and technologies, including fuzzy logic-based systems, intelligent traffic signal controllers, dynamic traffic information systems, and frameworks that utilize IoT, cloud computing, and machine learning. It identifies gaps in current literature, such as a lack of sensor data and advanced application frameworks for prediction. Finally, the document presents several comparison tables analyzing traffic prediction techniques based on the datasets, parameters, merits and demerits of each approach. The overall purpose is to conduct a systematic analysis of past work and identify future research
Real time deep-learning based traffic volume count for high-traffic urban art... (Conference Papers)
This document proposes and tests a deep learning-based system for real-time traffic volume counting on high-traffic urban arterial roads. Video clips from 4 camera views along arterial roads with estimated annual average daily traffic over 50,000 vehicles were used to test the system. The system achieved average accuracy rates between 93.84-97.68% across the camera views for 5 and 15-minute video clips. It was also able to process frames in real-time at an average of 37.27ms per frame. The proposed system provides an accurate and efficient method for traffic authorities to conduct traffic volume surveys on busy urban roads.
IRJET- Accident Information Mining and Insurance Dispute Resolution (IRJET Journal)
This document proposes a system to provide a centralized database for road accident information to help with insurance claims. The system would collect data from police reports on accident victims, medical forms, and other documents. It would apply k-means clustering to analyze the data and identify high-risk locations, accident ratios in different areas, and common causes of accidents. The results would be made available to users and police authorities. Association rule learning using the Apriori algorithm would also be used to determine common factors associated with accidents. The goal is to help reduce accidents by 24% by predicting risks and notifying users.
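The Apriori step mentioned above can be sketched as a level-wise search for frequent itemsets. This is a compact illustration only: the `apriori` function, the accident attributes, and the support threshold are made up, and the summary does not describe the authors' data or parameters.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset seen in at least min_support records."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        # Count how many records contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical accident records, each a set of contributing factors.
records = [
    {"night", "rain", "speeding"},
    {"night", "speeding"},
    {"rain", "speeding"},
    {"night", "rain"},
    {"night", "rain", "speeding"},
]

frequent = apriori(records, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Association rules such as "night implies speeding" would then be derived from these frequent itemsets by comparing the support of a pair with the support of its antecedent.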
IRJET- Monitoring and Analysing Real Time Traffic Images and Information Via ... (IRJET Journal)
This document describes a proposed system for monitoring and analyzing real-time traffic images and information using a database cloud. The system would allow drivers to share real-time traffic information by uploading photos and reports to the cloud. This shared information is compiled into a traffic digest that is then sent to other drivers to help with route planning and decision making. The proposed architecture involves a cloud-based system with storage and an application server, as well as client devices in vehicles that can upload and receive traffic information.
IRJET- Web Traffic Analysis through Data Analysis and Machine Learning (IRJET Journal)
This document discusses analyzing web traffic through data analysis and machine learning. It begins by introducing the need for web traffic analysis as internet usage increases. It then discusses collecting and processing web usage data, key metrics for analysis like number of users and page views, and identifying important performance indicators. The document outlines different methods of analyzing web traffic, both online and offline. It discusses using machine learning to accept web traffic data, clean it, and analyze patterns. The conclusion reiterates the importance of monitoring web traffic to provide sufficient capacity for users.
This work discusses the study and development of a graphical interface and the implementation of a machine learning model that predicts vehicle traffic injuries and fatalities for a specified date range and zip (US postal) code, based on New York City's (NYC) vehicle crash data set. Previous studies focused on accident causes and offered little insight into how such data may be used to forecast future incidents. Whereas earlier studies concentrated on particular road segment types, such as highways and other streets, and on a specific geographic region, this study offers a citywide review of collisions. A user-friendly interface was created, using database and networking technology, to display vehicle crash series. A support vector machine model was then built to evaluate the likelihood of an accident and the consequent injuries and deaths at the zip code level for all of NYC, to help mitigate such events. The findings indicate that the visualization and prediction approach is efficient and accurate. Beyond transportation experts and government policymakers, the machine learning approach delivers useful insights to the insurance business, since it quantifies collision risk at specific places.
IRJET- Web Traffic Analysis through Data Analysis and Machine LearningIRJET Journal
This document discusses analyzing web traffic through data analysis and machine learning. It begins by introducing the need for web traffic analysis as internet usage increases. It then discusses collecting and processing web usage data, key metrics for analysis like number of users and page views, and identifying important performance indicators. The document outlines different methods of analyzing web traffic, both online and offline. It discusses using machine learning to accept web traffic data, clean it, and analyze patterns. The conclusion reiterates the importance of monitoring web traffic to provide sufficient capacity for users.
This work discusses the study and development of a graphical interface and implementation of a machine learning model for vehicle traffic injury and fatality prediction for a specified date range and for a certain zip (US postal) code based on the New York City's (NYC) vehicle crash data set. While previous studies focused on accident causes, little insight has been offered into how such data may be utilized to forecast future incidents, Studies that have historically concentrated on certain road segment types, such as highways and other streets, and a specific geographic region, this study offers a citywide review of collisions. Using cutting-edge database and networking technology, a user-friendly interface was created to display vehicle crash series. Following this, a support vector machine learning model was built to evaluate the likelihood of an accident and the consequent injuries and deaths at the zip code level for all of NYC and to better mitigate such events. Using the visualization and prediction approach, the findings show that it is efficient and accurate. Aside from transportation experts and government policymakers, the machine learning approach deliver useful insights to the insurance business since it quantifies collision risk data collected at specific places.
1. Submitted By: Chandrasekar Hariharan & Vaikunth Sridharan
MENTOR: PRAMOD ANANTHARAMAN
CITY TRAFFIC EVENT
AGGREGATION AND
VISUALIZING SERVICE
WRIGHT STATE UNIVERSITY
CS 7800 WEB INFORMATION SYSTEMS
2. City Traffic Event Aggregation and Visualizing Service
Page 1
City Traffic Event
Aggregation and
Visualizing Service
Goal:
Visualize the events responsible for abnormal traffic patterns in a given traffic scenario.
1. Abstract:
In this project, we analyze various traffic data sources and event data sources to better explain the cause of any particular congestion. The data sources fall into two kinds: traffic data and event data. Congestion factor, traffic speed and link volume are three significant attributes that describe traffic variations. Weather data and active and scheduled event data are among the sources that describe the events occurring when traffic varies in terms of those attributes. The key elements of this project are collecting this information from the various sources, aggregating it, storing it in a database, building a visualization of city events that helps describe abnormal traffic behavior, and making the result available for further research analysis.
2. Introduction:
Traffic in city areas has recently increased enormously, making commuting tedious and drawn-out for the public [1]. The average commute in the Bay Area is 30 percent longer than it was a year and a half ago [2]. Drivers in the Bay Area spend more time stuck in traffic than moving. The classical theory of traffic flow describes two traffic phases: the uncongested (free flow) phase and the congested phase [3]. Joint measurements of speed and flow, gathered in a learning phase, are useful for validating traffic flow data results [3]. Speed and congestion in a traffic flow are inversely related [3].
Using existing traffic flow data, traffic for the next 15 to 30 minutes can be predicted with 80 percent accuracy [4]. A knowledge base built from incident reports, weather data and nearby event scenarios can serve, in addition to flow data, as a useful source of information for learning, constructing traffic models and predicting traffic flow [5].
Based on the information from the above data sources, in this project we develop aggregation and visualization mechanisms on top of the existing data sources. The selected data sources and the aggregation and visualization architecture are explained below.
Figure 1: Individual speed variance in traffic flow [3]
Figure 2: Required data sources for analysis [4]
3. Objectives
The goal of this project is to aggregate data from diverse data sources by location and timestamp in order to visualize spatial and thematic data. The objectives needed to accomplish this goal are listed below:
3.1 Finding Possible Data sources
Social tweets, popular events nearby, the weather at a location and traffic incidents in a network are key data sources that affect the traffic flow on a given traffic link [5]. Reliability, credibility and data accuracy are key factors in selecting the right data. Based on these factors, we have chosen five APIs that offer reliable services and accurate information about the data requested. The sources are Eventful, Eventbrite, OpenWeatherMap, and the HERE Maps flow and incident REST APIs.
3.2 Aggregating Data sources
The data collected from the above-mentioned APIs is aggregated by time and location. For any given instant and location, the aggregated table lists all the events that occurred and may have had an impact on traffic. In this project, we aim to aggregate the data sources and provide the results for further investigation. Learning, forecasting traffic and finding the cause of an incident are out of scope. With the aggregated data, a user can search for all the events bound to a timeline, so the aggregated data can serve as a source for many traffic-related predictions using statistical inference or machine learning techniques.
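The aggregation by time and location described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the project's own PHP, and the record fields (`kind`, `lat`, `lon`, `start`, `end`), the 5 km radius and the sample values are all assumptions made for the example, not part of the original system.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


@dataclass
class Record:
    kind: str          # illustrative label, e.g. "concert", "rain", "accident"
    lat: float
    lon: float
    start: datetime
    end: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def aggregate(records, lat, lon, when, radius_km=5.0):
    """Return every record active at `when` within `radius_km` of (lat, lon)."""
    return [
        r for r in records
        if r.start <= when <= r.end
        and haversine_km(lat, lon, r.lat, r.lon) <= radius_km
    ]


# Made-up sample records from three different kinds of source.
records = [
    Record("concert", 37.7786, -122.3893, datetime(2015, 4, 3, 19), datetime(2015, 4, 3, 23)),
    Record("accident", 37.7749, -122.4194, datetime(2015, 4, 3, 20), datetime(2015, 4, 3, 21)),
    Record("rain", 37.8044, -122.2712, datetime(2015, 4, 3, 18), datetime(2015, 4, 3, 22)),
]
nearby = aggregate(records, 37.7749, -122.4194, datetime(2015, 4, 3, 20, 30))
kinds = sorted(r.kind for r in nearby)
```

With these sample values the concert and the accident fall inside the radius and time window, while the rain record is too far away. A real deployment would more likely push the time and distance filter into the database query.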
3.3 Storing Thematic Data
The data obtained from the API servers is stored in the database at every time interval. Incident, flow and weather data change frequently, so storing them for future analysis is essential. This objective is to write a shell script that fetches the data from the APIs and updates the database at a fixed time interval.
3.4 Visualizing a Scenario
This objective is to visualize the data obtained from the various data sources on a map. The user selects a location and a time range for which to retrieve events. The request is forwarded to the server, which returns the collected values that match the user's request. The events may differ for every time range. For a given time range, the user can see graphically which traffic incidents happened and which events occurred at around the same time. This data can be used in future analysis to find relationships among the collected parameters.
4. Data Source Description
4.1 Scheduled Event Sources:
4.1.1 Eventful API
This API has the world's largest collection of events, ranging from local to global. Both venues and events can be searched. It uses a REST-based calling scheme. A private application key has to be requested to obtain data from this source.
Input Parameters:
1. Location latitude
2. Location longitude
3. From date and time
4. To date and time
Output Parameters:
Figure 3: Eventful Response Parameters
4.1.2 Eventbrite API
This website allows people to create and attend events in around 190 countries. Tickets for most events in big cities can be bought through it. The website gives an approximate attendance figure for an event. With this as a data source, information about popular events occurring near an area can be retrieved for analysis.
Input Parameters:
1. Location latitude
2. Location longitude
3. From date and time
4. To date and time
Output Parameters:
Figure 4: Eventbrite Response Parameters
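Both scheduled event sources take the same four inputs: latitude, longitude, start time and end time. As a rough sketch of how such a request could be assembled, the Python snippet below packs them into a query string. The endpoint URL and the parameter names (`latitude`, `longitude`, `from`, `to`) are hypothetical placeholders; the real Eventful and Eventbrite parameter names should be taken from each provider's documentation.

```python
from datetime import datetime
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical endpoint and parameter names, used only for illustration.
BASE_URL = "https://api.example.com/events/search"


def build_event_query(lat, lon, start, end):
    """Build a search URL from the four inputs listed above."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    params = {
        "latitude": f"{lat:.6f}",
        "longitude": f"{lon:.6f}",
        "from": start.isoformat(),
        "to": end.isoformat(),
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_event_query(37.7749, -122.4194,
                        datetime(2015, 4, 3, 18, 0), datetime(2015, 4, 3, 23, 0))
# Decode the query string again to check that the values survive URL encoding.
query = parse_qs(urlparse(url).query)
```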
4.2 Active Event Sources:
4.2.1 Open Weather API
This API provides weather information for a location. Weather has a major impact on traffic conditions and on the incidents that can happen on the road. It has been observed that during severe snow conditions traffic demand drops significantly and congestion on the freeway disappears [6]. The same research [6] reports a significant drop in traffic volume in snowy conditions together with a significant increase in traffic congestion. Damp conditions contribute significantly to traffic congestion in peak areas [6]. Increased congestion and decreased traffic volume were observed in city areas during wet conditions [6]. Hence it is important to consider weather data when analyzing traffic.
Input Parameter: Location
Output Parameter:
Figure 5: Open weather data Response Parameters
4.2.2 Here Incident API
Construction and accidents play a major role in disrupting the flow of traffic, thereby contributing to increased congestion. Accident rate is a complex characteristic of the "road-vehicle-driver-road environment" system [7]. From the research in [7], we can conclude that there is a direct relationship between accident rate and traffic volume.
Figure 6 shows a significant decrease in traffic volume when accident density increases. Hence monitoring traffic accident data gives us insight into abnormal traffic.
Figure 6: Relationship between traffic volume and accidents [8]
Construction at a location increases congestion by jam-packing a link, since it reduces the number of routes to a destination [8]. Hence monitoring construction events is necessary for analyzing traffic flow. The HERE API gives information about traffic accidents and construction around a location.
Input Parameters: Location & Bounding Box
Output Parameters:
Figure 7: Here Incidents data Response
4.3 Flow Data Sources:
4.3.1 Here Flow API
According to [4], predictions based on traffic flow data are about 80 percent accurate. This flow data is collected from sensors placed at many spots in the city. The HERE API collects the flow data and delivers it to the requester via REST calls.
Input Parameters: Location & Bounding Box
Output Parameters:
Figure 8: Here Flow API data response
Terms and Explanation
PC - Point location code
DE - Description of the road
QD - Queuing direction, '+' or '-'. Note that this is the opposite of the travel direction in the fully qualified ID; for example, for location 107+03021 the QD would be '-'.
LE - Length of the stretch of road
JF - Jamming factor
CN - Confidence, an indication of how the speed was determined. -1.0 = road closed; 1.0 = 100%; 0.7-100% historical. Usually a value between 0.7 and 1.0.
FF - Free flow speed on this stretch of road
SP - Speed (based on UNITS), capped by the speed limit
SU - Speed (based on UNITS), not capped by the speed limit
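To show how the abbreviations above might be used together, the following Python sketch summarizes one flow record. It assumes the record has already been flattened into a dictionary keyed by those abbreviations; the actual response layout is not shown in this document. The jam threshold of 7.0, the confidence cut-off of 0.7 and the sample values are assumptions chosen for illustration.

```python
def summarize_flow(record, jam_threshold=7.0, min_confidence=0.7):
    """Summarize one flow record keyed by the abbreviations listed above."""
    # CN of -1.0 marks a closed road, so speed figures are not meaningful.
    if record["CN"] == -1.0:
        return {"road": record["DE"], "status": "closed", "speed_ratio": None}
    # Ratio of uncapped speed (SU) to free flow speed (FF); 1.0 means free flow.
    speed_ratio = record["SU"] / record["FF"] if record["FF"] > 0 else None
    reliable = record["CN"] >= min_confidence
    congested = reliable and record["JF"] >= jam_threshold
    return {
        "road": record["DE"],
        "status": "congested" if congested else "normal",
        "speed_ratio": None if speed_ratio is None else round(speed_ratio, 2),
    }


# Sample values are made up for illustration.
sample = {"PC": 3021, "DE": "Main St", "QD": "-", "LE": 1.2,
          "JF": 8.1, "CN": 0.9, "FF": 50.0, "SP": 12.5, "SU": 12.5}
summary = summarize_flow(sample)
closed = summarize_flow({**sample, "CN": -1.0})
```

Here the sample road is flagged as congested because its jamming factor exceeds the assumed threshold and the confidence is high enough to trust the reading.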
4.3.2 511 Data
This data source gives real-time traffic information about the San Francisco Bay Area. Incident and construction details, information about a link and real-time traffic updates are gathered from it. From this source we obtain the speed and volume parameters, which can be valuable information for traffic prediction.
Input Parameters: Location and Timestamp
Output Parameters:
Figure 9: 511 Data response
System Architecture
Complete Architecture
The complete system architecture is described below. Events from multiple event sources are recorded, then aggregated and visualized by time.
Data sources are classified into event sources and traffic sources. Traffic sources provide real-time traffic and flow information for an area. Event sources provide information about the events occurring in a particular area. One main theme of this project is collecting data from these sources and storing it locally and dynamically in a MySQL database for future analysis.
JSON, a lightweight data-interchange format, is used to fetch information from the data source providers. User, environmental and traffic-sensitive information over a thematic timeline is retrieved and stored.
The user selects a timeline and searches for all the events that occurred within it. A map-based visualization lets the user move around the screen to locate the corresponding event pushpin and perform historical analysis on past events.
The server handles user requests to fetch values from the database, filters the search according to the user's input and delivers the matching events back to the client in JSON format.
JavaScript handles the data returned from the server and plots the event values on the map. Latitude, longitude and event description are the main fields used to place each pushpin and attach the correct event description to it.
Figure 10: System Architecture for Event Data Aggregation & Visualization
A script that continuously stores the events from API calls in the database runs as a cron job on the server. The cron job collects information about each event, parses it and stores the required parameters in the database.
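The parsing step of such a job can be sketched as below, in Python rather than the project's PHP. The response layout (a top-level `events` list) and the field names are hypothetical; each real provider returns its own structure, so this only illustrates the idea of keeping the required parameters and skipping incomplete entries.

```python
import json

# Fields kept from each event; names are hypothetical, real providers differ.
REQUIRED = ("title", "latitude", "longitude", "start_time")


def parse_events(payload):
    """Parse a JSON response and keep only complete event entries."""
    data = json.loads(payload)
    rows = []
    for item in data.get("events", []):
        # Skip entries that lack any field needed to place the event on the map.
        if all(item.get(key) not in (None, "") for key in REQUIRED):
            rows.append((
                item["title"],
                float(item["latitude"]),
                float(item["longitude"]),
                item["start_time"],
            ))
    return rows


# Made-up payload: one complete event and one without coordinates.
payload = json.dumps({"events": [
    {"title": "Concert", "latitude": "37.7786", "longitude": "-122.3893",
     "start_time": "2015-04-03 19:00:00", "extra": "ignored"},
    {"title": "No location", "start_time": "2015-04-03 20:00:00"},
]})
rows = parse_events(payload)
```

The resulting tuples are in a shape that can be passed directly to a parameterized database insert.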
Front-End Description
In this project, we have used several JavaScript libraries, such as jQuery UI, Raphael, Moment and the HERE API. The user sends requests for data to the server via AJAX calls. The server returns the requested data in JSON format. Pushpins and text are plotted based on the response obtained from the server. The Bootstrap CSS framework is used extensively to adapt the website for mobile devices. With this user interface, the user can easily identify the events tagged at a location. A scrollable timeline panel makes it easy to view the events by time.
Back-End Description
The server fetches the data from the event providers via API calls. The JSON string obtained is parsed and written to the database. This sequence runs as a cron job, so the events are updated dynamically at a specified time interval. Location and timestamp together form the primary key; these two parameters distinguish an event from all other events. Recording latitude and longitude is very useful for obtaining accurate information about a location and for plotting the event on the map. MySQL is used as the database to store all the active and scheduled events.
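The effect of using location and timestamp as the primary key can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server; the table and column names are assumptions, and `INSERT OR REPLACE` is SQLite syntax (MySQL would use `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE`).

```python
import sqlite3

# sqlite3 stands in for MySQL so the sketch runs without a database server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        latitude    REAL NOT NULL,
        longitude   REAL NOT NULL,
        event_time  TEXT NOT NULL,
        description TEXT,
        PRIMARY KEY (latitude, longitude, event_time)
    )
""")


def store_event(lat, lon, event_time, description):
    """Insert an event, overwriting any row with the same location and time."""
    conn.execute(
        "INSERT OR REPLACE INTO events VALUES (?, ?, ?, ?)",
        (lat, lon, event_time, description),
    )
    conn.commit()


# The second call has the same key as the first, so it replaces that row.
store_event(37.7749, -122.4194, "2015-04-03 20:00:00", "Accident reported")
store_event(37.7749, -122.4194, "2015-04-03 20:00:00", "Accident cleared")
store_event(37.7749, -122.4194, "2015-04-03 21:00:00", "Construction")
count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
latest = conn.execute(
    "SELECT description FROM events WHERE event_time = ?",
    ("2015-04-03 20:00:00",),
).fetchone()[0]
```

Because the key combines location and time, repeated cron runs that fetch the same event update the existing row instead of creating duplicates.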
Implementation
HTML, CSS and JavaScript are the three primary languages used to develop the visualization. We have used PHP on an Apache server to process requests from clients and to update the database dynamically with current events.
The user sends an AJAX request to the server asking for the events that happened within a date range. The server searches the database for events within that range. The list of POPOs (Plain Old PHP Objects) retrieved from the database is converted to JSON, and the JSON response is sent back to the user to visualize the events.
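The request handling described above can be sketched as follows. The sketch is in Python rather than PHP, with a dataclass playing the role of the plain objects retrieved from the database; the field names, the timestamp format and the sample events are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


@dataclass
class Event:
    description: str
    latitude: float
    longitude: float
    timestamp: str


def search_events(events, start, end):
    """Return the events whose timestamp lies in [start, end] as a JSON string."""
    lo, hi = datetime.strptime(start, FMT), datetime.strptime(end, FMT)
    hits = [e for e in events if lo <= datetime.strptime(e.timestamp, FMT) <= hi]
    # Plain objects are converted to dictionaries, then serialized to JSON.
    return json.dumps([asdict(e) for e in hits])


# Made-up stored events standing in for rows read from the database.
stored = [
    Event("Concert", 37.7786, -122.3893, "2015-04-03 19:00:00"),
    Event("Accident", 37.7749, -122.4194, "2015-04-04 08:15:00"),
]
response = search_events(stored, "2015-04-03 00:00:00", "2015-04-03 23:59:59")
decoded = json.loads(response)
```

On the client side, each decoded entry carries the latitude, longitude and description needed to place one pushpin.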
Cron jobs are scheduled to retrieve the data from the APIs and insert it into the MySQL database. Traffic information and event information are requested dynamically from the respective data sources. Responses from the data sources are parsed by the PHP server and written to the database.
The diagram below outlines the system's implementation.
Figure 11: System Outline
Snapshots:
Figure 12: User Interface Snap Shot
Figure 13: SERVER (PHP) CODE
Conclusion:
We have implemented the proposed architecture to store and visualize events on a timeline. This collection of events can be used to predict future traffic through statistical analysis, to alert individuals about a crowded link and to perform root cause analysis for events. As traffic congestion is a major problem in crowded cities, this tool can serve as an effective way to visualize ongoing event information and make better decisions based on it.
Instructions to execute
Back end:
1. Install XAMPP (distributed by Apache Friends) to get MySQL and the Apache server.
2. After installation, copy the PHP folder (containing the PHP files) to <XAMPP Directory>/htdocs/
3. Open the XAMPP Control Panel and start the Apache and MySQL servers.
4. You can now execute the PHP files from the browser by opening http://localhost/<PHPFolderName>. This is the folder you copied to htdocs in step 2.
Figure 14: Database Snapshot
Front end:
1. Copy the client-side folder to the same XAMPP directory, <XAMPP Directory>/htdocs/
2. The client-side user interface is now accessible at http://localhost/<ClientFolderName>. Both the client-side folder and the PHP folder must sit in the same htdocs directory, as in the hierarchy above.
References
1. http://research.microsoft.com/en-us/projects/clearflow/default.aspx
2. http://www.fox10phoenix.com/story/27842935/traffic-analytics-company-says-average-bay-area-commute-times-are-increasing
3. http://bayen.eecs.berkeley.edu/sites/default/files/conferences/Blandin_Salam_Bayen_TRB12.pdf
4. http://venturebeat.com/2015/04/03/how-microsofts-using-big-data-to-predict-traffic-jams-up-to-an-hour-in-advance/
5. Horvitz, E. J., Apacible, J., Sarin, R., & Liao, L. (2012). Prediction, expectation, and surprise: Methods, designs, and study of a deployed traffic forecasting service. arXiv preprint arXiv:1207.1352.
6. http://www.d.umn.edu/cs/thesis/lalit_nookala_ms.pdf
7. http://www.balticroads.org/downloads/25BRC/25brc_d1_pakalnis_1.pdf
8. http://www.dot.state.mn.us/d3/projects/interregionalconnection/pdfs/final/Chapter9.pdf
9. https://www.eventbrite.com/
10. http://lexington.eventful.com/events
11. https://developer.here.com/rest-apis
12. http://openweathermap.org/