This document discusses using sentiment analysis and geographic analysis to enhance security by predicting crime rates. It proposes collecting tweets from areas known for high crime and analyzing the sentiment to identify "red alert" neighborhoods in real-time. It also involves analyzing historical crime data using kernel density estimation to identify long-term crime hotspots. By combining real-time tweet sentiment analysis with historical crime patterns, the goal is to more accurately predict crime occurrences and notify citizens. The paper reviews literature on using machine learning techniques like sentiment analysis and kernel density estimation for crime prediction and monitoring social media for risky situations.
Dataset from National Institute of justice about the crimes of San Francisco. Apply Network Analysis after calculation of distances between different crime points as nodes of a city and then put the approach.
Event detection and summarization based on social networks and semantic query...ijnlc
Events can be characterized by a set of descriptive, collocated keywords extracted documents. Intuitively,
documents describing the same event will contain similar sets of keywords, and the graph for a document collection will contain clusters individual events. Helping users to understand the event is an acute problem nowadays as the users are struggling to keep up with tremendous amount of information published every day in the Internet. The challenging task is to detect the events from online web resources, it is getting more attentions. The important data source for event detection is a Web search log because the information it contains reflects users’ activities and interestingness to various real world events. There are three major issues playing role for event detection from web search logs: effectiveness, efficiency of
detected events. We focus on modeling the content of events by their semantic relations with other events
and generating structured summarization. Event mining is a useful way to understand computer system behaviors. The focus of recent works on event mining has been shifted to event summarization from discovering frequent patterns. Event summarization provides a comprehensible explanation of the event sequence based on certain aspects.
A predictive model for mapping crime using big data analyticseSAT Journals
Abstract Crime reduction and prevention challenges in today’s world are becoming increasingly complex and are in need of a new technique that can handle the vast amount of information that is being generated. Traditional police capabilities mostly fall short in depicting the original division of criminal activities, thus contribute less in the suitable allocation of police services. In this paper methods are described for crime event forecasting, using Hadoop, by studying the geographical areas which are at greater risk and outside the traditional policing limits. The developed method makes the use of a geographical crime mapping algorithm to identify areas that have relatively high cases of crime. The term used for such places is hot spots. The identified hotspot clusters give valuable data that can be used to train the artificial neural network which further can model the trends of crime. The artificial neural network specification and estimation approach is enhanced by processing capability of Hadoop platform. Keywords— Crime forecasting; Cluster analysis; artificial neural networks; Patrolling; Big data; Hadoop; Gamma test.
Dataset from National Institute of justice about the crimes of San Francisco. Apply Network Analysis after calculation of distances between different crime points as nodes of a city and then put the approach.
Event detection and summarization based on social networks and semantic query...ijnlc
Events can be characterized by a set of descriptive, collocated keywords extracted documents. Intuitively,
documents describing the same event will contain similar sets of keywords, and the graph for a document collection will contain clusters individual events. Helping users to understand the event is an acute problem nowadays as the users are struggling to keep up with tremendous amount of information published every day in the Internet. The challenging task is to detect the events from online web resources, it is getting more attentions. The important data source for event detection is a Web search log because the information it contains reflects users’ activities and interestingness to various real world events. There are three major issues playing role for event detection from web search logs: effectiveness, efficiency of
detected events. We focus on modeling the content of events by their semantic relations with other events
and generating structured summarization. Event mining is a useful way to understand computer system behaviors. The focus of recent works on event mining has been shifted to event summarization from discovering frequent patterns. Event summarization provides a comprehensible explanation of the event sequence based on certain aspects.
A predictive model for mapping crime using big data analyticseSAT Journals
Abstract Crime reduction and prevention challenges in today’s world are becoming increasingly complex and are in need of a new technique that can handle the vast amount of information that is being generated. Traditional police capabilities mostly fall short in depicting the original division of criminal activities, thus contribute less in the suitable allocation of police services. In this paper methods are described for crime event forecasting, using Hadoop, by studying the geographical areas which are at greater risk and outside the traditional policing limits. The developed method makes the use of a geographical crime mapping algorithm to identify areas that have relatively high cases of crime. The term used for such places is hot spots. The identified hotspot clusters give valuable data that can be used to train the artificial neural network which further can model the trends of crime. The artificial neural network specification and estimation approach is enhanced by processing capability of Hadoop platform. Keywords— Crime forecasting; Cluster analysis; artificial neural networks; Patrolling; Big data; Hadoop; Gamma test.
Abstract : Crime prediction is a topic of significant research across the fields of criminology, data mining, city planning, law enforcement, and political science. Crime patterns exist on a spatial level; these patterns can be grouped geographically by physical location, and analyzed contextually based on the region
in which crime occurs. This paper proposes a mechanism to parameterize street-level crime, localize crime hotspots, identify correlations between spatiotemporal crime patterns and social trends, and analyze the resulting data for the purposes of knowledge discovery and anomaly detection. The subject of this study is the county of Merseyside in the United Kingdom, over a span of 21 months beginning in December 2010 (monthly) through August 2012. Several types of crime are analyzed in this dataset, including Burglary and Antisocial Behavior. Through this analysis, several interesting findings are drawn about crime in Merseyside, including: hotspots with steadily increasing crime levels, hotspots with unstable crime levels, synchronous changes in crime trends throughout Merseyside as a whole, individual months in which certain hotspots behaved anomalously, and a strong correlation between crime hotspot locations and borough/postal code locations. We believe that this type of statistical and correlative analysis of crime patterns will help law enforcement agencies predict criminal activity, allocate resources, and promote community awareness to reduce overall crime rates.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
How to Calculate the Public Psychological Pressure in the Social NetworksTELKOMNIKA JOURNAL
With the worldwide application of social networks, new mathematical approaches have been developed that quantitatively address this online trend, including the concept of social computing. The analysis of data generated by social networks has become a new field of research; social conflicts on social networks occur frequently on the internet, and data regarding social behavior on social networks must be analyzed objectively. In this paper a type of social compouting method based on the principle of maximum entropyis proposed, and this type of social computing method can solve a series of complex social computing problems including the calculation of public psychological pressure. The quantitative calculation of public psychological pressure is so important to the public opinion analysis that it can be widely applied in a lot of public information analysis fields.
First study on a complete dataset of Tweet
Speech presented during the 4th edition of Transforming Audiences Conference, University of Westminister - 3 Septermber 2013
BREAKING MIGNOTTE’S SEQUENCE BASED SECRET SHARING SCHEME USING SMT SOLVERijcsit
The secret sharing schemes are the important tools in cryptography that are used as building blocks in
many secured protocols. It is a method used for distributing a secret among the participants in a manner
that only the threshold number of participants together can recover the secret and the remaining set of
participants cannot get any information about the secret. Secret sharing schemes are absolute for storing
highly sensitive and important information. In a secret sharing scheme, a secret is divided into several
shares. These shares are then distributed to the participants’ one each and thus only the threshold (t)
number of participants can recover the secret. In this paper we have used Mignotte’s Sequence based
Secret Sharing for distribution of shares to the participants. A (k, m) Mignotte's sequence is a sequence of
pair wise co-prime positive integers. We have proposed a new method for reconstruction of secret even
with t-1 shares using the SMT solver.
Crime Data Analysis, Visualization and Prediction using Data MiningAnavadya Shibu
This paper presents a general idea about the
model of Data Mining techniques and diverse crimes. It also
provides an inclusive survey of competent and valuable
techniques on data mining for crime data analysis. The
objective of the data mining is to recognize patterns in
criminal manners in order to predict crime anticipate
criminal activity and prevent it. This project implements a
novel data mining techniques like KNN, Text Clustering, IR
tree for investigating the crime data sets and sorts out the
accessible problems. The collective knowledge of various
data mining algorithms tend certainly to afford an
enhanced, incorporated, and precise result over the crime
prediction in the banking sectors Our law enforcement
organizations require to be adequately outfitted to defeat
and prevent the crime. This project is developed using Java
as front-end and MySQL as back-end. Supporting
applications like Sunset, NetBeans are used to make the
portal more interactive.
A Survey on Data Mining Techniques for Crime Hotspots PredictionIJSRD
A crime is an act which is against the laws of a country or region. The technique which is used to find areas on a map which have high crime intensity is known as crime hotspot prediction. The technique uses the crime data which includes the area with crime rate and predict the future location with high crime intensity. The motivation of crime hotspot prediction is to raise people’s awareness regarding the dangerous location in certain time period. It can help for police resource allocation for creating a safe environment. The paper presents survey of different types of data mining techniques for crime hotspots prediction.
Event detection in twitter using text and image fusioncsandit
In this paper, we describe an accurate and effective event detection method to detect events from
Twitter stream. It detects events using visual information as well as textual information to improve
the performance of the mining. It monitors Twitter stream to pick up tweets having texts and photos
and stores them into database. Then it applies mining algorithm to detect the event. Firstly, it detects
event based on text only by using the feature of the bag-of-words which is calculated using the term
frequency-inverse document frequency (TF-IDF) method. Secondly, it detects the event based on
image only by using visual features including histogram of oriented gradients (HOG) descriptors,
grey-level co-occurrence matrix (GLCM), and color histogram. K nearest neighbours (Knn)
classification is used in the detection. Finally, the final decision of the event detection is made based
on the reliabilities of text only detection and image only detection. The experiment result showed that
the proposed method achieved high accuracy of 0.93, comparing with 0.89 with texts only, and 0.86
with images only.
Physical and Cyber Crime Detection using Digital Forensic Approach: A Complet...IJARIIT
Criminalization may be a general development that has significantly extended in previous few years. In
order, to create the activity of the work businesses easy, use of technology is important. Crime investigation analysis
is a section records in data mining plays a crucial role in terms of predicting and learning the criminals. In our
paper, we've got planned an incorporated version for physical crime as well as cybercrime analysis. Our approach
uses data mining techniques for crime detection and criminal identity for physical crimes and digitized forensic tools
(DFT) for evaluating cybercrimes. The presented tool named as Comparative Digital Forensic Process tool
(CDFPT) is entirely based on digital forensic model and its stages named as Comparative Digital Forensic Process
Model (CDFPM). The primary step includes accepting the case details, categorizing the crime case as physical crime
or cybercrime and sooner or later storing the data in particular databases. For physical crime analysis we've used kmeans
approach cluster set of rules to make crime clusters. The k-means method effects are a lot advantageous by the
utilization of GMAPI generation. This provides advanced and consumer-friendly visual-aid to k-means approach for
tracing the region of the crime. we have applied KNN for criminal identification with the
help of observing beyond crimes and finding similar ones that suit this crime, if no past document is discovered then
the new crime sample are introduced to the crime data-set. With the advancements of web, the network form has
become much more complicated and attacking methods are further more than that as well. For crime analysis
we're detecting the attacks executed on host system through an outsider the usage of
assorted digitized forensic tools to produce information security with the help of generating reports for an
event which could need any investigation. Our digitized technique aids the development of the society
by helping the investigation businesses to follow a custom-built investigative technique in crime analysis and criminal
identification as opposed to manually looking the database to analyze criminal activities, and as a
result facilitate them in combating crimes.
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...cscpconf
Smart cities utilize Internet of Things (IoT) devices and sensors to enhance the quality of the city
services including energy, transportation, health, and much more. They generate massive
volumes of structured and unstructured data on a daily basis. Also, social networks, such as
Twitter, Facebook, and Google+, are becoming a new source of real-time information in smart
cities. Social network users are acting as social sensors. These datasets so large and complex
are difficult to manage with conventional data management tools and methods. To become
valuable, this massive amount of data, known as 'big data,' needs to be processed and
comprehended to hold the promise of supporting a broad range of urban and smart cities
functions, including among others transportation, water, and energy consumption, pollution
surveillance, and smart city governance. In this work, we investigate how social media analytics
help to analyze smart city data collected from various social media sources, such as Twitter and
Facebook, to detect various events taking place in a smart city and identify the importance of
events and concerns of citizens regarding some events. A case scenario analyses the opinions of
users concerning the traffic in three largest cities in the UAE
Temporal Exploration in 2D Visualization of Emotions on Twitter StreamTELKOMNIKA JOURNAL
As people freely express their opinions toward a product on Twitter streams without being bound
by time, visualizing time pattern of customers emotional behavior can play a crucial role in decisionmaking.
We analyze how emotions are fluctuated in pattern and demonstrate how we can explore it into
useful visualizations with an appropriate framework. We manually customized the current framework in
order to improve a state-of-the-art of crawling and visualizing Twitter data. The data, post or update on
status on the Twitter website about iPhone, was collected from U.S.A, Japan, Indonesia, and Taiwan by
using geographical bounding-box and visualized it into two-dimensional heat map, interactive stream
graph, and context focus via brushing visualization. The results show that our proposed system can
explore uniqueness of temporal pattern of customers emotional behavior.
Abstract : Crime prediction is a topic of significant research across the fields of criminology, data mining, city planning, law enforcement, and political science. Crime patterns exist on a spatial level; these patterns can be grouped geographically by physical location, and analyzed contextually based on the region
in which crime occurs. This paper proposes a mechanism to parameterize street-level crime, localize crime hotspots, identify correlations between spatiotemporal crime patterns and social trends, and analyze the resulting data for the purposes of knowledge discovery and anomaly detection. The subject of this study is the county of Merseyside in the United Kingdom, over a span of 21 months beginning in December 2010 (monthly) through August 2012. Several types of crime are analyzed in this dataset, including Burglary and Antisocial Behavior. Through this analysis, several interesting findings are drawn about crime in Merseyside, including: hotspots with steadily increasing crime levels, hotspots with unstable crime levels, synchronous changes in crime trends throughout Merseyside as a whole, individual months in which certain hotspots behaved anomalously, and a strong correlation between crime hotspot locations and borough/postal code locations. We believe that this type of statistical and correlative analysis of crime patterns will help law enforcement agencies predict criminal activity, allocate resources, and promote community awareness to reduce overall crime rates.
For more information, please visit: http://people.cs.vt.edu/parang/ or contact parang at firstname at cs vt edu
How to Calculate the Public Psychological Pressure in the Social NetworksTELKOMNIKA JOURNAL
With the worldwide application of social networks, new mathematical approaches have been developed that quantitatively address this online trend, including the concept of social computing. The analysis of data generated by social networks has become a new field of research; social conflicts on social networks occur frequently on the internet, and data regarding social behavior on social networks must be analyzed objectively. In this paper a type of social compouting method based on the principle of maximum entropyis proposed, and this type of social computing method can solve a series of complex social computing problems including the calculation of public psychological pressure. The quantitative calculation of public psychological pressure is so important to the public opinion analysis that it can be widely applied in a lot of public information analysis fields.
First study on a complete dataset of Tweet
Speech presented during the 4th edition of Transforming Audiences Conference, University of Westminister - 3 Septermber 2013
BREAKING MIGNOTTE’S SEQUENCE BASED SECRET SHARING SCHEME USING SMT SOLVERijcsit
The secret sharing schemes are the important tools in cryptography that are used as building blocks in
many secured protocols. It is a method used for distributing a secret among the participants in a manner
that only the threshold number of participants together can recover the secret and the remaining set of
participants cannot get any information about the secret. Secret sharing schemes are absolute for storing
highly sensitive and important information. In a secret sharing scheme, a secret is divided into several
shares. These shares are then distributed to the participants’ one each and thus only the threshold (t)
number of participants can recover the secret. In this paper we have used Mignotte’s Sequence based
Secret Sharing for distribution of shares to the participants. A (k, m) Mignotte's sequence is a sequence of
pair wise co-prime positive integers. We have proposed a new method for reconstruction of secret even
with t-1 shares using the SMT solver.
Crime Data Analysis, Visualization and Prediction using Data MiningAnavadya Shibu
This paper presents a general idea about the
model of Data Mining techniques and diverse crimes. It also
provides an inclusive survey of competent and valuable
techniques on data mining for crime data analysis. The
objective of the data mining is to recognize patterns in
criminal manners in order to predict crime anticipate
criminal activity and prevent it. This project implements a
novel data mining techniques like KNN, Text Clustering, IR
tree for investigating the crime data sets and sorts out the
accessible problems. The collective knowledge of various
data mining algorithms tend certainly to afford an
enhanced, incorporated, and precise result over the crime
prediction in the banking sectors Our law enforcement
organizations require to be adequately outfitted to defeat
and prevent the crime. This project is developed using Java
as front-end and MySQL as back-end. Supporting
applications like Sunset, NetBeans are used to make the
portal more interactive.
A Survey on Data Mining Techniques for Crime Hotspots PredictionIJSRD
A crime is an act which is against the laws of a country or region. The technique which is used to find areas on a map which have high crime intensity is known as crime hotspot prediction. The technique uses the crime data which includes the area with crime rate and predict the future location with high crime intensity. The motivation of crime hotspot prediction is to raise people’s awareness regarding the dangerous location in certain time period. It can help for police resource allocation for creating a safe environment. The paper presents survey of different types of data mining techniques for crime hotspots prediction.
Event detection in twitter using text and image fusioncsandit
In this paper, we describe an accurate and effective event detection method to detect events from
Twitter stream. It detects events using visual information as well as textual information to improve
the performance of the mining. It monitors Twitter stream to pick up tweets having texts and photos
and stores them into database. Then it applies mining algorithm to detect the event. Firstly, it detects
event based on text only by using the feature of the bag-of-words which is calculated using the term
frequency-inverse document frequency (TF-IDF) method. Secondly, it detects the event based on
image only by using visual features including histogram of oriented gradients (HOG) descriptors,
grey-level co-occurrence matrix (GLCM), and color histogram. K nearest neighbours (Knn)
classification is used in the detection. Finally, the final decision of the event detection is made based
on the reliabilities of text only detection and image only detection. The experiment result showed that
the proposed method achieved high accuracy of 0.93, comparing with 0.89 with texts only, and 0.86
with images only.
Physical and Cyber Crime Detection using Digital Forensic Approach: A Complet...IJARIIT
Criminalization may be a general development that has significantly extended in previous few years. In
order, to create the activity of the work businesses easy, use of technology is important. Crime investigation analysis
is a section records in data mining plays a crucial role in terms of predicting and learning the criminals. In our
paper, we've got planned an incorporated version for physical crime as well as cybercrime analysis. Our approach
uses data mining techniques for crime detection and criminal identity for physical crimes and digitized forensic tools
(DFT) for evaluating cybercrimes. The presented tool named as Comparative Digital Forensic Process tool
(CDFPT) is entirely based on digital forensic model and its stages named as Comparative Digital Forensic Process
Model (CDFPM). The primary step includes accepting the case details, categorizing the crime case as physical crime
or cybercrime and sooner or later storing the data in particular databases. For physical crime analysis we've used kmeans
approach cluster set of rules to make crime clusters. The k-means method effects are a lot advantageous by the
utilization of GMAPI generation. This provides advanced and consumer-friendly visual-aid to k-means approach for
tracing the region of the crime. we have applied KNN for criminal identification with the
help of observing beyond crimes and finding similar ones that suit this crime, if no past document is discovered then
the new crime sample are introduced to the crime data-set. With the advancements of web, the network form has
become much more complicated and attacking methods are further more than that as well. For crime analysis
we're detecting the attacks executed on host system through an outsider the usage of
assorted digitized forensic tools to produce information security with the help of generating reports for an
event which could need any investigation. Our digitized technique aids the development of the society
by helping the investigation businesses to follow a custom-built investigative technique in crime analysis and criminal
identification as opposed to manually looking the database to analyze criminal activities, and as a
result facilitate them in combating crimes.
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...cscpconf
Smart cities utilize Internet of Things (IoT) devices and sensors to enhance the quality of the city
services including energy, transportation, health, and much more. They generate massive
volumes of structured and unstructured data on a daily basis. Also, social networks, such as
Twitter, Facebook, and Google+, are becoming a new source of real-time information in smart
cities. Social network users are acting as social sensors. These datasets so large and complex
are difficult to manage with conventional data management tools and methods. To become
valuable, this massive amount of data, known as 'big data,' needs to be processed and
comprehended to hold the promise of supporting a broad range of urban and smart cities
functions, including among others transportation, water, and energy consumption, pollution
surveillance, and smart city governance. In this work, we investigate how social media analytics
help to analyze smart city data collected from various social media sources, such as Twitter and
Facebook, to detect various events taking place in a smart city and identify the importance of
events and concerns of citizens regarding some events. A case scenario analyses the opinions of
users concerning the traffic in three largest cities in the UAE
Temporal Exploration in 2D Visualization of Emotions on Twitter StreamTELKOMNIKA JOURNAL
As people freely express their opinions toward a product on Twitter streams without being bound
by time, visualizing time pattern of customers emotional behavior can play a crucial role in decisionmaking.
We analyze how emotions are fluctuated in pattern and demonstrate how we can explore it into
useful visualizations with an appropriate framework. We manually customized the current framework in
order to improve a state-of-the-art of crawling and visualizing Twitter data. The data, post or update on
status on the Twitter website about iPhone, was collected from U.S.A, Japan, Indonesia, and Taiwan by
using geographical bounding-box and visualized it into two-dimensional heat map, interactive stream
graph, and context focus via brushing visualization. The results show that our proposed system can
explore uniqueness of temporal pattern of customers emotional behavior.
As per studies conducted by the University of California, it is observed that crime in any area follows the same pattern as that of earthquake aftershocks. It is difficult to predict an earthquake, but once it happens the aftershocks following it are quite predictable. Same is true for the crimes happening in a geographical area.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
6th International Conference on Machine Learning & Applications (CMLA 2024)ClaraZara1
6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of on Machine Learning & Applications.
Water billing management system project report.pdfKamal Acharya
Our project entitled “Water Billing Management System” aims is to generate Water bill with all the charges and penalty. Manual system that is employed is extremely laborious and quite inadequate. It only makes the process more difficult and hard.
The aim of our project is to develop a system that is meant to partially computerize the work performed in the Water Board like generating monthly Water bill, record of consuming unit of water, store record of the customer and previous unpaid record.
We used HTML/PHP as front end and MYSQL as back end for developing our project. HTML is primarily a visual design environment. We can create a android application by designing the form and that make up the user interface. Adding android application code to the form and the objects such as buttons and text boxes on them and adding any required support code in additional modular.
MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software. It is a stable ,reliable and the powerful solution with the advanced features and advantages which are as follows: Data Security.MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software.
SENTIMENT ANALYSIS AND GEOGRAPHICAL ANALYSIS FOR ENHANCING SECURITY
1. SENTIMENT ANALYSIS AND GEOGRAPHICAL ANALYSIS FOR ENHANCING SECURITY
Namratha M1
, R Hariharan2
, Nimmi K3
, J SangeethaPriya4
1,2,3,4
Assistant Professor, 1
Department of CSE, BMS College of Engineering,
Bangalore,
2
Department of Information Technology,
Vel Tech Rangarajan Dr Sagunthala R&D Institute of Science and Technology,
3
Department of CSE, SCMS School of Engineering and Technology,
4
Department of IT, Saranathan college of Engineering ,
1
namratha.july@gmail.com, 2
hharanbtech@gmail.com, 3
menimmicse@scmsgroup.org
4
priyamail.kumar@gmail.com
Abstract: Enhancing security around our general public
has been the major challenging issue .This paper is a
point by point study about the endeavours that have been
made to enhance the forecast of the crime intensity
utilizing machine learning and big data to help the
citizens to remain in a sheltered zone. In this manner, the
sort of study discussed here will help reveal the crime
rate of an area progressively. This paper focuses on both
the setimental analysis as well as geogrpahic analysis of
an area to give more accurate results regarding the type
of the crime in a particular area.
Keywords: Machine learning, Sentiment Analysis,
Tweets, KDE(Kernel Density Estimation)
1. Introduction
Crime is a forbidden act that is punishable by authorities
.it can be any social nuisance which cause defined as an
act harmful not only to the individual and also to the
community ,society etc.
The issues of the crime are growing in certain cities
of the world. So, there is a need for some technology
which can predict the occurrences of crime which means
automated crime detection is much needed in the society.
This will help the police departments in great extent.
The study and analysis of crime is done using
computer technology, legal strategies along with tactics
targeting what type of crime ,for example murder,
robbery etc, leads to the prediction of digital crime,
digital terrorism, an information warfare.
The technology used here is geographical analysis[1]
along with sentimental analysis of online social
networking sites which help in the detection of crime
patterns. Twitter is one of the online social networking
and micro-blogging feature that enables users to post
short description referred to as "tweets". These updates
may convey important information. Here a system is
designed which extract tweets from cities that are
regarded to be dangerous cities in the country.
A geographic analysis and the sentiment
analysis[2][3] of data uncovered a connection between
tweets and crimes that occurred in the corresponding
cities. Sentiment analysis techniques were implemented
on these tweets to analyze the intensities of crime in
particular location. These processes need to be identified
and their performance can then be improved using data
from previous executions.
2. Literature Survey
Nowadays with the help of statistical learning techniques,
computers can learn and identify the patterns in the given
data. The most common applications uses of machine
learning are Data Mining, Natural Language Processing
(NLP), Image and Speech Recognition, Medical
diagnosis, finance, transportation, retail and social media
services industry. It is being set as the pillar of future
civilization.[4][5][6]
Twitter as of now serves roughly 140 million
worldwide users posting a consolidation 340 million
messages (or tweets) per day. Bollen Stated that, "a
tweet is a microscopic, temporally-authentic instantiation
of sentiment". Since tweets are brief, people in general
opinion can be effectively investigated. Twitter likewise
provides the feature of Re-tweet (RT), which allows
users to give comments of comments. Large scale
analysis of tweets might provide insights that are not
apparent within a single field. Thus we need to channel in
lights on our needs.
Sentiment Analysis is the process of
‘computationally’ determining whether a piece of writing
is positive, negative or neutral. It’s also known
as ‘opinion mining’, finding the attitude or opinion of the
International Journal of Pure and Applied Mathematics
Volume 118 No. 22 2018, 1777-1781
ISSN: 1314-3395 (on-line version)
url: http://acadpubl.eu/hub
Special Issue
ijpam.eu
1777
2. user. Sentiment analysis[7] also is used to monitor and
analyse social phenomena, for the spotting of possibly
risky circumstances. The quick development of online
networking has built the interest in sentimen
forms of online expressions (e.g., opinions
ratings, and recommendations) have turned out to be a
significant source of information to the organisation to
deal with their reputations. The challenge of detecting
crime patterns lies in geographically analyzing the hot
spot areas[8][9] in the city and then performing sentiment
analysis on the tweets of the city to find red
in real-time. The geographical analysis is done using the
static data source and sentimental analysis
real-time tweets. This study helps use in predicting the
crime using both real-time and static data sources more
precisely and accurately.
In this study we do the following:
1. Collect the data for geographical analysis from
various open sources. Use the data collected and analyse
it using the KDE[10][11] (Kernel Density Estimation)
which gives crime intensity distribution and helps us
determine hot spot areas. Based on crime intensities we
plot graph, explained in the Related Works in the
session.
Figure 1. flow diagram of system
2. Collect the real-time data from twitter API. Analyze
textual content in Twitter data by using sentiment
analysis to obtain the polarity of tweets
trends in different neighbourhoods. Tweets are collected
and filtered based on hot-spot areas (output of KDE).
Analysis is carried out to get the sentiment of the tweet.
We categorize them into positive, negative, very positive,
very negative and neutral. Besides sentiment polarity, we
were also interested in the trend of sentiment within each
neighbourhood. More detailed description of sentimental
analysis is explained in related work in next session.
3. For the prediction, we first calculate the estimated
crime density of each neighbourhood on the certain
period by using the KDE method mentioned above. Then
we process tweets on the that period to calculate the
polarity score and trend index. The logistic regression
Sentiment analysis[7] also is used to monitor and
analyse social phenomena, for the spotting of possibly
risky circumstances. The quick development of online
networking has built the interest in sentiment. Various
forms of online expressions (e.g., opinions-like reviews,
ratings, and recommendations) have turned out to be a
significant source of information to the organisation to
deal with their reputations. The challenge of detecting
in geographically analyzing the hot
spot areas[8][9] in the city and then performing sentiment
analysis on the tweets of the city to find red-alerts areas
time. The geographical analysis is done using the
static data source and sentimental analysis is done with
time tweets. This study helps use in predicting the
time and static data sources more
1. Collect the data for geographical analysis from
ources. Use the data collected and analyse
it using the KDE[10][11] (Kernel Density Estimation)
which gives crime intensity distribution and helps us
determine hot spot areas. Based on crime intensities we
plot graph, explained in the Related Works in the next
flow diagram of system
time data from twitter API. Analyze
textual content in Twitter data by using sentiment
[12] and their
Tweets are collected
spot areas (output of KDE).
Analysis is carried out to get the sentiment of the tweet.
We categorize them into positive, negative, very positive,
very negative and neutral. Besides sentiment polarity, we
also interested in the trend of sentiment within each
neighbourhood. More detailed description of sentimental
analysis is explained in related work in next session.
3. For the prediction, we first calculate the estimated
on the certain
period by using the KDE method mentioned above. Then
we process tweets on the that period to calculate the
polarity score and trend index. The logistic regression
model uses estimated density and twitter features to
predict the likelihood of incidents occurring on that
period in its neighbourhood. The graph plotted thus
shows the intensity of each crime in the city and hence
helps notifying the citizens about the appropriate
measures to be taken.[13][14]
3. Related Work
Here, techniques to be used are included and are
following:
i) Kernel Density Estimation (KDE):An approach to
crime prediction is hotspot mapping, based on the
assumption that crime predictions that the locations of
past events are good predictors of future events. This
approach generally includes two categories: methods that
rely on aggregate crime data and analysis of discrete
crime event locations. [15][16]The end result is visual
representation of crime points, thereby creating
continuous risk surface. Hot spot identific
on the size of the geographical area of concern, location
and addresses to streets, blocks and neighbourhoods. The
computation equation for KDE is as follows:
1, ,
1
We calculate crime density at each point p. We set
the following parameters:
‘h’ as bandwidth parameter
‘P’ as total number of crime incidents that occurred
‘j’ indicates single crime point from the data of crime
period
‘K’ is standard normal density function
||.|| is Euclidean norm
‘p(j)’ actual crime point.
Methods used for identifying crime hot spots are
divided into two general categories: Aggregate incident
location and analysis of point-pa
ii) Sentiment Analysis:
propose to study the feature
summarisation of criminal tweets. The assignment is
performed in two stages: First, recognize the highlights
of the tweets that people have tweeted. Second, for each
element we distinguish what number of tweets have
positive or negative opinions. The particular
express these conclusions are appended to the element.
This facilitates browsing of the sur
clients. In terminology identification, there are essentially
two techniques for finding terms in corpora: symbolic
approaches that depend on syntactic depiction of terms,
to be specific noun phrases, and factual methodologies
that make us of the way that the words
tend to be discovered near each other at reoccurring. The
system performs the summarisation in two main steps:
model uses estimated density and twitter features to
f incidents occurring on that
period in its neighbourhood. The graph plotted thus
shows the intensity of each crime in the city and hence
helps notifying the citizens about the appropriate
3. Related Work
be used are included and are
i) Kernel Density Estimation (KDE):An approach to
crime prediction is hotspot mapping, based on the
assumption that crime predictions that the locations of
past events are good predictors of future events. This
roach generally includes two categories: methods that
rely on aggregate crime data and analysis of discrete
crime event locations. [15][16]The end result is visual
representation of crime points, thereby creating
continuous risk surface. Hot spot identification depends
on the size of the geographical area of concern, location
and addresses to streets, blocks and neighbourhoods. The
computation equation for KDE is as follows:
∨ ∨
We calculate crime density at each point p. We set
the following parameters:
‘h’ as bandwidth parameter
‘P’ as total number of crime incidents that occurred
‘j’ indicates single crime point from the data of crime
period
nsity function
||.|| is Euclidean norm
‘p(j)’ actual crime point.
Methods used for identifying crime hot spots are
divided into two general categories: Aggregate incident
patterns.
) Sentiment Analysis: In this examination, we
propose to study the feature-based [17] opinion
summarisation of criminal tweets. The assignment is
performed in two stages: First, recognize the highlights
tweeted. Second, for each
element we distinguish what number of tweets have
positive or negative opinions. The particular surveys that
express these conclusions are appended to the element.
This facilitates browsing of the surveys by potential
In terminology identification, there are essentially
two techniques for finding terms in corpora: symbolic
approaches that depend on syntactic depiction of terms,
to be specific noun phrases, and factual methodologies
that make us of the way that the words making a term
tend to be discovered near each other at reoccurring. The
system performs the summarisation in two main steps:
International Journal of Pure and Applied Mathematics Special Issue
1778
3. ‘Feature Extraction’ and ‘Opinion orientation
identification’. Feature Extraction, given the data
sources, the framework initially downloads all the tweets,
and places them in the survey database. The feature
extraction work, first concentrates “hot” highlights that
many individuals have communicated their opinions in
their tweets, and after that finds the infrequent once.
Opinion orientation identification, function takes the
generated features and compresses the feelings of the
component into two classifications, positive and negative.
Frequent features generation, this step is to find features
that people are most interested in. In order to do this, we
use association rule mining to find all frequent itemsets.
An association rule[18] is an implication of the form X --
> Y, where X-->I, Y-->I, and X --> Y = Ø. The rule X--
>Y holds in D with confidence c. If c% of transactions in
D that support X also support Y. The rule has support s in
D if s% of the transactions in D contain X?Y. Mining
frequent occurring phrases, each bit of data extracted
above is put away in dataset called a transaction
set/document. We at that point run the association rule
miner, which depends on the Apriori algorithm. This
algorithm works in two stages, initial step is to discover
all frequent item sets from an arrangement of exchanges
that satisfy a user-specified minimum support. Feature
Purning Compactness pruning, method checks features
that contain at least two words, which we call feature
phrases, and remove those that are likely to be
meaningless. Redundancy Purning, we focus on
removing redundant features that contain single words.
Opinion Sentence orientation Identification, after opinion
features have been identified, we determine the semantic
orientation (i.e., positive or negative) of each opinion
sentence. This consists of two steps: (1) for each opinion
word in the opinion word list, we identify its semantic
orientation using a bootstrapping technique and the
WordNet and (2) we then decide the opinion orientation
of each sentence based on the dominant orientation of the
opinion words in the sentence.
iii) Prediction using sentimental polarity and historical
crime record as variables: Our goal is to predict future
crime on certain areas of the city. To predict the future
crime, we are interested in the trend of sentiment within
each neighbourhood along with sentiment polarity.
Intuitively, consecutive periods of positive or negative
sentiments might cause a greater risk of crime. So, to
measure the trend, we create a trend index. The
motivation of the algorithm comes from the pattern of
stock costs. The data available from twitter depends on
time. This means that each observation of a dataset has a
time tag attached to it. This kind of data is known as
‘time-series data’. The current ways to deal with this
issue, thrown in two gatherings: Feature Filters and
Feature wrappers.[19] The former are independent of the
modelling tool that will be used after the feature selection
phase. They basically try to use some statistical
properties of the features (ex. Correlation) to select the
final set of features. The latter approach include the
modelling tool in the selection process. They complete an
iterative search process where each iteration is a
candidate set of features is tried with the modelling tool
and the respective results are recorded. The decision of
models was predominantly guided by the way that these
techniques are well known by their ability to handle
highly nonlinear regression problems. Detailed
approaches to this domain would necessarily require a
large comparison of more alternatives. Two such
alternatives are listed be: Artificial Neural Network and
Support-Vector machine. (1) Artificial Neural
Networks(ANNs) are frequently used in financial
forecasting because of their ability to deal with highly
nonlinear problems. ANNs are formed by a set of
computing units (the neurons) linked to each other. (2)
Support Vector Machine, modelling tools that, as ANNs,
can be applied to both regression and classification tasks.
SVMs have been seeing expanded consideration from
various research groups in perspective of their fruitful
application to a few areas and furthermore their solid
hypothetical foundation. The fundamental idea behind
SVMs is that of mapping the original data into another,
high-dimensional space, where it is conceivable to apply
straight models to get an separating hyper plane, for
instance, isolating the classes of the problem, in the case
of classification tasks.
4. Conclusion
Public Security is major challenging issue. The machine
learning techniques can be efficiently integrated in the
field of crime investigation and thus help us to take
precautionary measure before the incidents occur. It will
be helpful to prevent major crime at very fast manner.
Machine learning techniques easily find and analyze
how crime is going to happen.
References
[1] HARDI. M. PATEL, RIPAL PATEL “Enhance
algoritham to predict a crime using datamining”, Journal
of Emerging Technologies and Innovative Research
(JETIR) ,Volume 4, Issue 04, April 2017 Pgno:257-259
[2] Rui Xia , Jie Jiang, and Huihui He “Distantly
Supervised Lifelong Learning for Large-Scale
SocialMedia Sentiment Analysis” IEEE transactions on
affective computing, vol. 8, no. 4, october-december
2017 ,480-491
International Journal of Pure and Applied Mathematics Special Issue
1779
4. [3] Harpreet Kaur, Veenu Mangat ,Nidhi ,”A survey
of sentiment analysis techniques”, International
conference on I-SMAC (IoT in Social, Mobile, Analytics
and Cloud) feb 2017, 921-925
[4] Metin Bilgin, Izzat Fatih ,”Analysis on Twitter
data with semi-supervised Doc2Vec, Computer Science
and Engineering (UBMK), 2017 International
Conference onYear: 2017,pgno:661-666
[5] Ankit Kumar Soni ,”Multi-lingual sentiment
analysis of Twitter data by using classification
algorithms”, Computer and Communication
Technologies (ICECCT), feb :2017,
[6] Ms.A.M.Abirami, Ms.V.Gayathri ,”A survey on
sentiment analysis methods and approach” 2016 IEEE
Eighth International Conference on Advanced
Computing (ICoAC) Jan 2017,pg.no:72-76
[7] Poornima Nedunchezhian, Shomona Gracia Jacob
“Social Influence Algorithms and Emotion Classification
for Prediction of Human Behavior: A Survey , “ Second
International Conference on Recent Trends and
Challenges in Computational Models (ICRTCCM), feb
:2017,pg.no: 55-60
[8] WalaaMedhata
AhmedHassanb
HodaKorashyb
“Sentiment analysis algorithms and applications: A
survey”, Volume 5, Issue 4, December 2014, Pages 1093-
1113
[9] Mohammad A. Tayebi, Martin Ester, Uwe Glasser
, Patricia L. Brantingham “CRIMETRACER: Activity
space based crime location prediction,” IEEE/ACM
International Conference on Advances in Social
Networks Analysis and Mining (ASONAM), Aug
2014,pg.no:472-480
[10] Anthony J. Corso, Gondy Leroy, Tucson
Abdulkareem Alsusdais, , Tucson Abdulkareem
Alsusdais “Toward Predictive Crime Analysis via Social
Media, Big Data, and GIS”, iconference 2015
[11] S.V.Manikanthan and T.Padmapriya “Recent
Trends In M2m Communications In 4g Networks And
Evolution Towards 5g”, International Journal of Pure and
Applied Mathematics, ISSN NO: 1314-3395, Vol-115,
Issue -8, Sep 2017.
[12] S.V. Manikanthan, T. Padmapriya “An enhanced
distributed evolved node-b architecture in 5G tele-
communications network” International Journal of
Engineering & Technology (UAE), Vol 7 Issues No (2.8)
(2018) 248-254.March2018.
[13] S.V. Manikanthan, T. Padmapriya, Relay Based
Architecture For Energy Perceptive For Mobile Adhoc
Networks, Advances and Applications in Mathematical
Sciences, Volume 17, Issue 1, November 2017, Pages
165-179
International Journal of Pure and Applied Mathematics Special Issue
1780