CRIME PREDICTION PROJECT
MACHINE LEARNING PROJECT
Abstract
Crime is one of the most serious and prevalent problems in our society, so preventing it is an
important task. Crime analysis should be carried out systematically, because systematic analysis
is what makes detection and prevention possible: it reveals patterns in investigations and helps
identify trends in crime. The main aim of this project is to analyse the efficiency of crime
investigation. The model is designed to detect crime patterns from inferences collected at the
crime scene, and from these inferences the paper demonstrates how a prediction can be made. The
dataset used in this paper is the “Communities and Crime” dataset from the UCI repository.
Analysing crime data is needed to lower the crime rate; it helps the police and citizens take the
necessary actions and solve crimes faster. In this paper, data mining techniques are applied to
crime data to predict the features that contribute to a high crime rate.
Nutshell
A crime is a deliberate act that can cause physical or psychological harm, as well as property damage or loss,
and can lead to punishment by a state or other authority according to the severity of the crime. The number and
forms of criminal activities are increasing at an alarming rate, forcing agencies to develop efficient methods to
take preventive measures. In the current scenario of rapidly increasing crime, traditional crime-solving
techniques are too slow and inefficient to deliver results. Thus, if we can come up with ways
to predict crime in detail before it occurs, or come up with a “machine” that can assist police officers, it would
lift the burden on the police and help in preventing crimes. To achieve this, we suggest using machine learning
(ML) and computer vision algorithms and techniques. In this paper, we describe the results of certain cases
where such approaches were used, and which motivated us to pursue further research in this field. The main
motivation for changing how crime is detected and prevented lies in the before-and-after statistics observed by
the authorities using such techniques. The purpose of this study is to determine how ML can be used by
law enforcement agencies or authorities to detect, prevent, and solve crimes more accurately and faster.
Crime?
Crime, when it occurs frequently in a society, affects its organizations and institutions. Thus, it is necessary to study
the factors behind different crimes and the relations between them, and to find a way to accurately predict and avoid these crimes. Recently,
law enforcement agencies have been moving towards a more empirical, data-driven approach to predictive policing. However,
even with new data-driven approaches to predicting crime, the fundamental job of crime analysts remains difficult and often
manual: specific patterns of crime are not easy to find with automated tools, whereas larger-scale density-based
trends, made up mainly of background crime levels, are much easier for data-driven approaches and software to estimate.
Crime predictions can be made through both qualitative and quantitative methods. Qualitative approaches to forecasting
crime, such as environmental scanning and scenario writing, are useful in identifying the future nature of criminal activity. In contrast,
quantitative methods are used to predict the future scope of crime, and more specifically crime rates. A common method for
developing forecasts is to project annual crime rate trends using time series models.
Algorithms
The algorithms we chose are:
∙ Decision tree
∙ KNN
∙ Linear SVC
∙ Gaussian Naïve Bayes classification
∙ Polynomial SVC
∙ Random forest
∙ Gradient boosting classification
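As a rough sketch, the seven algorithms listed above could be instantiated with scikit-learn as follows. The class names are real scikit-learn estimators, but the hyperparameters (for example `n_neighbors=5` and `degree=3`) and the synthetic data from `make_classification` are assumptions made for illustration; they are not the settings or the dataset used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical hyperparameters; the slides do not state the actual values.
models = {
    "Decision tree": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Linear SVC": LinearSVC(max_iter=10000, random_state=0),
    "Gaussian Naive Bayes": GaussianNB(),
    "Polynomial SVC": SVC(kernel="poly", degree=3),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient boosting": GradientBoostingClassifier(random_state=0),
}

# Synthetic stand-in data (NOT the Communities and Crime dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Every estimator shares the same fit / score interface.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.3f}")
```

Keeping the models in one dictionary makes it easy to compare them on the same train/test split.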
Architecture
Libraries?
Pandas: An open-source library that provides high-performance, easy-to-use
data structures and data analysis tools for the Python programming
language. It stores tabular data in rows and columns using DataFrames,
which makes the data easy to process.
NumPy: A powerful package for scientific computing in Python. Its core is
the N-dimensional array object (ndarray); the package is conventionally imported as np.
Scikit-learn: A widely used and robust library for machine learning in Python.
It provides a selection of efficient tools for machine learning and statistical
modeling, including classification, regression, clustering and dimensionality
reduction, via a consistent interface.
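The short sketch below shows how pandas and NumPy typically work together: a DataFrame holds the named columns, and `to_numpy()` hands the values over as an ndarray. The column names and values are invented for illustration and are not taken from the project's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table; column names and values are made up for illustration.
df = pd.DataFrame(
    {
        "population": [0.19, 0.00, 0.04, 0.01],
        "pct_unemployed": [0.27, 0.27, 0.36, 0.33],
        "high_crime": [0, 1, 1, 0],
    }
)

# pandas: tabular data addressed by column name.
features = df[["population", "pct_unemployed"]]
target = df["high_crime"]

# NumPy: the same values as an N-dimensional array (ndarray).
X = features.to_numpy()
y = target.to_numpy()

print(X.shape)             # (rows, columns)
print(np.mean(X, axis=0))  # per-column mean
```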
Algorithm Implementation
DECISION TREE CLASSIFIER
A decision tree is a supervised learning method for classification. The goal
of this classifier is to create a model that predicts the value of a target variable
by learning simple decision rules inferred from the data features. A deeper tree
encodes more complex decision rules and fits the training data more closely,
although a tree that is too deep can overfit. Entropy measures the impurity of
the class labels at a node, and the information gain of a decision rule is the
reduction in entropy it achieves. Hence, for this predictive model, entropy has
been used as the criterion for splitting branches.
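To make the entropy criterion concrete, here is a minimal sketch of how the information gain of a candidate split can be computed. The function names and the label lists are hypothetical; this is not the project's code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - children


# Hypothetical labels: 1 = high crime, 0 = low crime.
parent = [1, 1, 1, 0, 0, 0]
perfect_split = information_gain(parent, [1, 1, 1], [0, 0, 0])
useless_split = information_gain(parent, [1, 0], [1, 1, 0, 0])
print(perfect_split, useless_split)
```

A split that separates the classes completely recovers all of the parent's entropy, while a split that leaves both children as mixed as the parent gains nothing. The tree picks the split with the largest gain at each node.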
KNN CLASSIFIER
KNN algorithms store the training data and classify new data points based on a
similarity measure (e.g. a distance function). A new point is assigned the class
chosen by a majority vote of its k nearest neighbors.
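The majority-vote rule can be sketched in a few lines of plain Python. Euclidean distance and `k=3` are assumptions chosen for illustration, and the points and labels are made up.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D points with made-up labels.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```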
LINEAR SVC
Linear SVC is an algorithm for solving multiclass classification problems on very large data sets
by training a linear support vector machine. Some implementations train the model with a
cutting-plane algorithm; scikit-learn's LinearSVC is built on the liblinear library.
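Whatever solver is used for training, a linear support vector machine predicts with a linear decision function and is trained to keep a hinge-type loss small. The sketch below shows only the prediction rule and the hinge loss, not the training step; the weights `w` and bias `b` are hand-picked, hypothetical values.

```python
def decision_value(weights, bias, x):
    """Signed score w . x + b; its sign gives the predicted class."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if decision_value(weights, bias, x) >= 0 else -1


def hinge_loss(weights, bias, x, y):
    """Hinge loss max(0, 1 - y * (w . x + b)) for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * decision_value(weights, bias, x))


# Hypothetical hyperplane x1 + x2 - 3 = 0, chosen by hand for illustration.
w, b = [1.0, 1.0], -3.0

print(predict(w, b, [3.0, 2.0]))         # point on the positive side
print(hinge_loss(w, b, [3.0, 2.0], +1))  # outside the margin: zero loss
print(hinge_loss(w, b, [2.0, 1.5], +1))  # inside the margin: positive loss
```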
GAUSSIAN NB CLASSIFIER
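The GaussianNB module applies Bayes' theorem with the “naive” assumption of independence between
every pair of features given the class; naive Bayes methods are a set of supervised learning
algorithms built on this assumption. If y is the attribute to be predicted and X = (x1, ..., xn) are
the attributes used for prediction, then P(y | X) is proportional to P(y) * P(x1 | y) * ... * P(xn | y),
and the predicted class is the y that maximizes this product. In Gaussian naive Bayes, each
P(xi | y) is modeled as a normal distribution.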
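The rule can be sketched from scratch as follows: estimate a prior and a per-feature mean and variance for each class, then pick the class with the highest log-probability. This is an illustrative simplification with hypothetical function names and made-up data, not scikit-learn's GaussianNB implementation.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the largest log P(y) + sum_i log P(x_i | y)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for value, mean, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical two-feature data with made-up labels.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [4.0, 5.0], [4.2, 4.8], [3.8, 5.2]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [1.1, 2.1]), predict_gaussian_nb(model, [4.1, 4.9]))
```

Working with log-probabilities instead of the raw product avoids numerical underflow when there are many features.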
POLYNOMIAL SVC
In machine learning, the polynomial kernel is a kernel function commonly used with support
vector machines (SVMs) and other kernelized models, that represents the similarity of vectors
(training samples) in a feature space over polynomials of the original variables, allowing
learning of non-linear models.
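A common form of the polynomial kernel is K(x, y) = (x . y + c)^d. The sketch below, using hypothetical vectors and the assumed values d = 2 and c = 1, checks that the kernel equals an ordinary dot product in an expanded feature space of polynomial terms, which is why it lets a linear method learn a non-linear boundary.

```python
import math


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """K(x, y) = (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def degree2_features(x):
    """Explicit feature map for 2-D input matching degree=2, coef0=1."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [x1 * x1, x2 * x2, r2 * x1 * x2, r2 * x1, r2 * x2, 1.0]


# Hypothetical sample vectors.
u, v = [1.0, 2.0], [3.0, 0.5]

kernel_value = polynomial_kernel(u, v)
explicit_value = sum(a * b for a, b in zip(degree2_features(u), degree2_features(v)))
print(kernel_value, explicit_value)
```

The kernel gives the same similarity without ever building the expanded feature vectors, which matters when the degree or the number of input features is large.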
RANDOM FOREST CLASSIFIER
This classifier fits a number of decision tree classifiers on various sub-samples
of the dataset and combines their predictions by averaging, which controls
over-fitting and improves predictive accuracy. It has been used here to obtain
better accuracy than a single decision tree.
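The idea of training many trees on sub-samples and combining them can be sketched in plain Python. To stay short, this simplified version uses one-split trees (decision stumps) on a single feature and a majority vote instead of averaged probabilities; all names, the seed and the data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-split tree on 1-D data: pick the threshold with fewest errors."""
    majority = Counter(ys).most_common(1)[0][0]
    best = (len(ys) + 1, None, majority, majority)  # (errors, threshold, left, right)
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not right:
            continue
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_label, right_label = stump
    if threshold is None:  # degenerate sample: always predict the majority label
        return left_label
    return left_label if x <= threshold else right_label


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Combine the trees by majority vote."""
    return Counter(predict_stump(s, x) for s in forest).most_common(1)[0][0]


# Hypothetical 1-D feature with made-up labels.
xs = [1, 2, 3, 4, 5, 11, 12, 13, 14, 15]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
forest = fit_forest(xs, ys)
print(predict_forest(forest, 0), predict_forest(forest, 20))
```

Because each tree sees a different bootstrap sample, individual trees make different mistakes, and the vote smooths those mistakes out.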
GRADIENT BOOSTING CLASSIFIER
Gradient boosting classifiers are a group of machine learning algorithms that
combine many weak learning models to create a strong predictive model.
Decision trees are usually used as the weak learners. The Gradient Boosting
Classifier depends on a loss function: each new weak learner is fitted to reduce
the loss left by the models built so far.
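The role of the loss function can be illustrated with a deliberately simplified sketch. It uses squared loss on 0/1 targets, for which the negative gradient is simply the residual, and one-split regression trees as weak learners. A real gradient boosting classifier would typically use a log-loss instead, so treat this as an illustration of the boosting loop under those assumptions; the names, learning rate and data are hypothetical.

```python
def fit_regression_stump(xs, residuals):
    """Find the 1-D split whose two leaf means best fit the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps, losses = [], []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the residual y - F(x).
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_regression_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))
    return base, stumps, losses


# Hypothetical 1-D data: the target jumps from 0 to 1 partway along x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
base, stumps, losses = boost(xs, ys)
print(losses[0], losses[-1])
```

The training loss shrinks as rounds are added, which is the sense in which many weak learners combine into a strong model.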