A Study on an Urban Safety Index Model Using Public Big Data and SNS Data
1. A Study on an Urban Safety Index Model
Using Public Big Data and SNS Data
Young-Won Seo, Hyun-Suk Hwang, Palash Sontakke, Shao Xiaorui,
Tea-Gun Jeon, Tea-Yeon Kim, Chang-Soo Kim
ICONI 2017
2. Index
1. Background
2. Approach & Goal
3. Related Work
4. Safety Index Model
1) Data Preprocessing
2) Model
3) System Design
5. Conclusion & Future Work
3. Background
Recent Disaster and Accident Issues in Korea
- Avian influenza
- Traffic accident
- Heavy rain and earthquake
Regional Safety Index
- Deprecated
- Does not reflect regional characteristics
- Released once a year
4. Goal & Approach
Analysis of public data sets
- Open data sets of the Korean government
- Include features of the locality
- Released as files and APIs (OpenAPI)
System design of a disaster monitoring system for public data and SNS data
- Interactive disaster monitoring system showing the real-time state
- Push real-time notifications to users
5. Related Works
Regional Safety Index
- Measure of the level of safety at regional levels
- Seven fields and five levels (level 1: weakest)
Social Big Board
- Real-time SNS collection and filtering
- Classification by disaster type
- Works with the Smart Big Board
- Private system
6. Workflow
Workflow of an integrated model
7. Urban Safety Index – Data Preprocessing
Public Data Set
Type | Organization | Attribute
Weather | Korea Meteorological Administration | Temperature (max, min, avg), Precipitation, Wind speed (max, avg), Humidity, Measuring point
Air environment | Korea Environment Corporation | SO2, CO, O3, NO2, Fine dust (PM10, PM2.5), Measuring point
Traffic accident | Traffic Accident Analysis System | Datetime (year, month, day, hour), Address, Lat, Lng, Count (Dead, Injury)
Infection | Korea Centers for Disease Control and Prevention | Address, Count (enterohemorrhagic, heterogeneity, typhoid, paratyphoid, cholera, hepatitis, …)
Fire | National Fire Data System | Address, Property damage, Count (Injury, Fire, Casualties)
Crime | National Police Agency | Address, Count (violence, intelligence, theft, transportation, …)
… | … | …
8. Urban Safety Index – Data Preprocessing
Merge by location and time
1. Standardize coordinate system
- Area, address -> WGS84 (lat, lng)
- UTMK -> WGS84 (lat, lng)
2. Calculate data by format/period
- Average of year, month, day
- Summation of year, month
3. Merge locality code
- Korea legal locality code
- For matching with layout of map
4. Merge by nearest point (latitude, longitude)
- Weather measuring point: ASOS, Disaster
- Air pollution measuring point
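The "calculate data by format/period" step can be illustrated with a minimal sketch. It assumes daily records with hypothetical field names (`station`, `date`, `temp`, `accidents`) that do not come from the slides; measurements such as temperature are averaged per month, while counts such as accidents are summed.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_month(records):
    """Group daily records by (station, 'YYYY-MM').

    Temperatures are averaged and accident counts are summed,
    mirroring the "average" and "summation" bullets on the slide.
    Field names are hypothetical, chosen only for this sketch.
    """
    groups = defaultdict(list)
    for rec in records:
        month = rec["date"][:7]  # 'YYYY-MM-DD' -> 'YYYY-MM'
        groups[(rec["station"], month)].append(rec)

    result = {}
    for key, rows in groups.items():
        result[key] = {
            "avg_temp": mean(r["temp"] for r in rows),
            "accidents": sum(r["accidents"] for r in rows),
        }
    return result


# Illustrative values only, not data from the study.
daily = [
    {"station": "Busan", "date": "2017-01-01", "temp": 4.0, "accidents": 2},
    {"station": "Busan", "date": "2017-01-02", "temp": 6.0, "accidents": 1},
    {"station": "Busan", "date": "2017-02-01", "temp": 8.0, "accidents": 3},
]
monthly = aggregate_by_month(daily)
```

The same grouping key could be switched to a year or a day by changing the slice of the date string.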
9. Urban Safety Index - Model
Model evaluation
– Statistical public data with weather and air environment
Type | Related Attribute | Logistic regression | Decision Tree | SVM | Neural Network
Traffic accident | temperature, fine dust, humidity, wind direction, wind speed | 34% | 34% | 45% | 40%
Fire | temperature, wind direction, humidity | 37% | 38% | 42% | 33%
Suicide | maximum temperature, fine dust, wind speed, humidity | 35% | 43% | 40% | 29%
Infection | minimum/maximum temperature, wind speed, fine dust, humidity | 31% | 36% | 39% | 29%
Safety accident | temperature, precipitation, fine dust, humidity, wind speed | 30% | 23% | 36% | 28%
10. Urban Safety Index - Model
Model evaluation
– Detailed traffic accident data with weather and air environment
Algorithm | Accuracy
Logistic regression | 45%
Decision Tree | 42%
SVM | 40%
Neural Network | 53%
11. Urban Safety Index - SNS Data
Collection and preprocessing
1. Collect tweets using the Twitter stream API
- Filtering by disaster keywords, e.g. fire, traffic accident
- Include lat, lng
2. Classify disaster type / positive or negative
- Using CNN for sentence classification
- Problem of Korean language for NLP
- Using test data and just count
3. Send to client (Browser/Mobile App)
- Send to view using web socket
4. Store to database
- Store in JSON format
12. System Design - Architecture
[Figure: system architecture diagram. Recoverable labels:]
- Client: ReactJS (Ducks/Redux), Router, Logger, Thunk, ncigmap, echart, Google map, …
- RESTful (OAuth)
- Middleware
- Server (Monolithic, MVC): Tweet Collector, Twitter Stream API, Twitter Analyzer (NLP), Public Data Collector, Wrapper API, Auth, Favorite, Location, View Static, User, Admin, …
- Database: MySQL, MongoDB / Redis, SNS, Public Data (Weather, Transportation, Fire, Location, …)
13. System Design - App
[Screenshots: Main, Safety Index, Menu]
14. System Design - App
[Screenshots: Safety Index, Traffic Accident, Fire, Tweet]
15. Conclusion & Future Work
Conclusion
- Collected public data and preprocessed it for analysis.
- Calculated the correlation between the weather and various disaster data in the public data sets.
- Implemented a disaster monitoring system using public data sets.
Future Work
- Improving classification performance with granular data
- Using crawled data from portals or government systems with APIs
- Processing SNS data using NLP
Hi, my name is Young-Won Seo. I work at nc infotech in Busan, Korea. I am glad to meet you.
Today, I'm going to present an urban safety index model using public data, and a disaster monitoring system built on it. The title is "A Study on an Urban Safety Index Model Using Public Big Data and SNS Data".
As the table of contents shows, I'm going to talk about the background, approach and goal, related work, the safety index model, and finally the conclusion and future work.
In recent years, there have been various disaster issues in Korea. Typically, natural disasters such as floods and earthquakes are unpredictable and unexpected, and they have the potential to cause human casualties.
Also, infections such as avian influenza become an issue whenever they break out, but they tend to receive less response or attention when they do not.
Therefore, the South Korean government created a safety index model in 2014, using statistical data on disasters and accidents. It is called the Regional Safety Index.
It ranks regions according to the types of disasters occurring in them, so that each local government can identify and manage the state of safety within its region.
However, there is also some evidence that this index has difficulty dealing with recent disaster issues. The reason is that the Regional Safety Index is computed from accident statistics, and therefore it does not reflect sudden disasters.
For example, because it is based on accident statistics and, apart from distinguishing rural from urban areas, does not reflect regional characteristics,
rural villages come out as safer than cities with respect to traffic accidents and fires.
So, this paper
presents a prototype safety model based on data that include regional characteristics,
and a visualization system for this model.
This study uses public data sets that contain regional characteristics.
Public data is released by each government agency to guarantee access to it and to promote its use by the private sector.
Next, it presents a system design for visualizing the model.
It provides interactive visualization and real-time notification.
As related work, there are the Regional Safety Index and the Social Big Board.
The Regional Safety Index quantifies the level of safety of local governments using related statistical data.
It is calculated from risk indicators, vulnerability indicators, and mitigation indicators according to a given formula. The natural disaster field uses local safety diagnosis results.
The index is divided into five levels and is provided for seven fields, including fire, traffic, and crime.
However, it is released only once a year, so it cannot reflect real-time conditions.
The Social Big Board is a government monitoring system that uses social network data to cope with real-time situations, rather than relying on a once-a-year index.
It enables real-time data processing, such as collecting disaster-related data from SNS and categorizing it by disaster type. It is linked to a traditional disaster monitoring system and can only be used by public institutions.
The overall process of this study and the future research is as follows.
There are three stages: first, data collection and preprocessing; second, model generation; third, system development and model application.
First, the data preprocessing step.
We searched various data sources to create a model.
We collected data on weather, traffic accidents, fire, crime, and population. To develop the prototype model, I used the following data.
After collecting the data, we merge the related data sets.
First, we unify the coordinate format. The actual data comes in various coordinate formats, and sometimes coordinates are missing, so we converted everything to WGS84 latitude and longitude using software or a location API.
Second, we perform calculations appropriate to the unit and period. This means aggregating the data by day or matching the timestamps of the data.
Third, we attach the Korean legal locality code for visualization, because the regions on the Korean map are identified by that code.
Fourth, we merge the data, matching each record with the data generated at the nearest measuring point.
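As a rough illustration of the fourth step, the sketch below attaches each event to its nearest measuring point using the haversine great-circle distance on WGS84 latitude and longitude. The talk does not say which distance measure was actually used, so haversine is an assumption, and the station list, coordinates, and function names are hypothetical.

```python
import math


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two WGS84 points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_station(lat, lng, stations):
    """Return the station dict closest to the given point."""
    return min(
        stations,
        key=lambda s: haversine_km(lat, lng, s["lat"], s["lng"]),
    )


# Hypothetical measuring points with approximate coordinates.
stations = [
    {"name": "Busan", "lat": 35.10, "lng": 129.03},
    {"name": "Seoul", "lat": 37.57, "lng": 126.97},
]
accident = {"lat": 35.18, "lng": 129.08}
match = nearest_station(accident["lat"], accident["lng"], stations)
```

A linear scan like this is fine for a handful of stations; with many points, a spatial index would be the usual next step.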
Next, we apply the following algorithms to evaluate the preprocessed data.
This is an evaluation of the yearly data, divided at the gu (district) level within the city.
I found and evaluated the associated attributes, but I could not achieve high performance.
The reasons are that the data was only preprocessed, its precision and volume are low, and it contains NA (missing) values.
The following is a model evaluating the relationship between weather and detailed traffic accident data.
This data has more detailed information than the other data, but the accuracy is still only around 50% because of the small amount of data.
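An accuracy figure is easier to interpret next to a simple baseline. The sketch below computes accuracy and the score of always predicting the most frequent class. The class labels and predictions are invented for illustration; the talk does not state how the target classes were defined or how many there were.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most frequent class."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Hypothetical labels, not results from the study.
y_true = ["low", "low", "mid", "high", "low", "mid"]
y_pred = ["low", "mid", "mid", "low", "low", "high"]

acc = accuracy(y_true, y_pred)
base = majority_baseline(y_true)
```

In this toy case the classifier only matches the baseline, which is the kind of comparison that would show whether a score near 50% carries any signal.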
In this study, SNS data was used for the real-time model.
We use the Twitter streaming API to get relevant tweets and visualize them.
To get tweets, we filter by disaster-related keywords and classify the disaster type and whether the tweet is positive or negative.
All of this depends almost entirely on natural language processing, which is not discussed in detail here.
I visualized the tweets on the map using only test data.
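The keyword filtering and JSON storage steps can be sketched as follows. This is a simplified illustration, not the system's code: the tweet dictionaries use a hypothetical flat shape (`text`, `lat`, `lng`) rather than the real Twitter payload, and plain substring matching stands in for the CNN classifier mentioned on the slide.

```python
import json

# Keywords taken from the slide's examples.
DISASTER_KEYWORDS = ("fire", "traffic accident")


def match_disaster_type(text, keywords=DISASTER_KEYWORDS):
    """Return the first keyword found in the text, or None."""
    lowered = text.lower()
    for keyword in keywords:
        if keyword in lowered:
            return keyword
    return None


def filter_tweets(tweets):
    """Keep tweets that have coordinates and mention a disaster keyword."""
    kept = []
    for tweet in tweets:
        if tweet.get("lat") is None or tweet.get("lng") is None:
            continue  # the map view needs a location
        disaster_type = match_disaster_type(tweet["text"])
        if disaster_type is not None:
            kept.append({**tweet, "disaster_type": disaster_type})
    return kept


# Hypothetical test tweets.
tweets = [
    {"text": "Big fire near the station", "lat": 35.1, "lng": 129.0},
    {"text": "Traffic accident on the bridge", "lat": None, "lng": None},
    {"text": "Nice weather today", "lat": 35.2, "lng": 129.1},
]
kept = filter_tweets(tweets)

# Serialize one record to JSON, as it might be stored or pushed to a client.
payload = json.dumps(kept[0], ensure_ascii=False)
```

Substring matching produces false positives (for example "fired" contains "fire"), which is one reason a sentence classifier is needed after this first filter.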
This is the system design for visualization.
It focuses on scalability and is largely divided into a client and a server.
The client uses React, and the server is based on Express and Spring.
The server exposes a REST API so that the client can load data asynchronously and interactively.
To handle growing data and keep the system scalable, we used NoSQL, which is easy to shard.
These screens show the result of the system development.
The prototype client was developed as a mobile version. You can monitor the layers and the public data for each local region.
These are the views of the Safety Index, Traffic Accident, Fire, and Tweet.
Here's the final conclusion.
In this study, public data were first collected and preprocessed for the safety index model and its visualization.
Preprocessing merged the data by location and time and handled the abnormal values.
Then, several algorithms were applied to generate and evaluate models, and a visualization system was developed.
The most important thing in this study is the quality and quantity of data. I tried to build a prototype of the relationship between the weather and traffic accidents, but it didn't perform well.
In further studies, alongside system development, we intend to improve the performance of the models and of the SNS processing by collecting better-quality data.