Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data
Presented by:
Dr. Zakaria Suliman Zubi
Associate Professor
Computer Science Department
Faculty of Science
Sirte University
Sirte, Libya
3. Abstract
Law enforcement agencies, represented by the police, face a
large volume of data every day. These data can be processed and
transformed into useful information. In this sense, data mining can be
applied to greatly improve crime analysis, which can help to reduce
and prevent crime as much as possible.
Crime reports and data are used as an input for the formulation of the
crime prevention policies and strategic plans.
www.themegallery.com
This work applies several data mining methods to analyze Libyan
national criminal record data, to help the Libyan government make
strategic decisions about curbing the currently rising crime rate.
The data were collected manually from the Supreme Security Committee (SSC) in Benghazi, Tripoli, and AlJafara.
Our proposed model will be able to extract crime patterns by using
association rule mining and clustering to classify crime records on the
basis of the values of crime attributes.
A comparison between the two algorithms is also discussed in this work.
Company Logo
5. Introduction
Data mining, or Knowledge Discovery in Databases (KDD), is in simple
words the nontrivial extraction of implicit, previously unknown, and
potentially useful information from data.
KDD is the process of identifying valid, potentially useful, and
ultimately understandable structure in data.
Crime analysis is an emerging field in law enforcement without
standard definitions. This makes it difficult to determine the focus of
crime analysis for agencies that are new to the field.
Crime analysis is the act of analyzing crime. More specifically, crime
analysis is the breaking up of acts committed in violation of laws into
their parts to find out their nature, and the reporting of statements of
these findings.
The role of crime analysts varies from agency to agency.
The objective of most crime analysis is to
find meaningful information in vast amounts of data and disseminate
this information to officers and investigators in the field to assist in
their efforts to apprehend criminals and suppress criminal activity.
7. Data Mining Categories
The Data Mining models are categorized into different leaves. Further,
each leaf signifies the relationship, if any, that is highlighted from the
database. The Data Mining models can be put into one of six main
categories: 1) Association, 2) Classification, 3) Clustering,
4) Prediction, 5) Sequence Discovery, and 6) Generalization.
8. Data Mining Categories Cont…
Table 2 classifies the various Data Mining algorithms according to
problem type, namely, Association, Classification, Clustering,
Prediction, Discovery, and Summarization.
9. Data Mining Categories Cont…
11. Why Analyze Crime?
Crime analysts are often asked to justify their role within a law
enforcement agency. It makes sense to analyze crime, and some good
reasons are listed as follows:
1. Analyze crime to inform law enforcers about general and specific crime
trends, patterns, and series in an ongoing, timely manner.
2. Analyze crime to take advantage of the abundance of information existing in
law enforcement agencies, the criminal justice system, and public domain.
3. Analyze crime to maximize the use of limited law enforcement resources.
4. Analyze crime to have an objective means to assess crime problems locally,
regionally, and nationally, within and between law enforcement agencies.
5. Analyze crime to be proactive in detecting and preventing crime.
6. Analyze crime to meet the law enforcement needs of a changing society.
12. Why Analyze Crime? Cont…
In general, there are four different techniques
for analyzing crimes, as follows:
1. Linkage Analysis
2. Statistical Analysis
3. Profiling
4. Spatial Analysis
Each of the above techniques has its own
advantages and drawbacks and can be used
in specific cases.
14. Data Mining Task
A. Data collection phase.
In this phase, the dataset that we used as training and testing
data was extracted from the police departments. It contains
information about both crimes and criminals, with the following
main attributes:
1. Crime ID: individual crimes are designated by a unique crime ID.
2. Crime type: indicates the type of crime.
3. Date: indicates when the crime happened.
4. Gender: male or female.
5. Age: age of the individual criminal.
6. Crime address: location of the crime.
7. Marital status: marital status of the criminal.
Figure 2: Raw data
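As a rough illustration, the seven attributes listed above could be held in a record like the following. This is only a minimal Python sketch: the class name `CrimeRecord`, the field names, the field types, and the sample values are assumptions made for illustration and are not taken from the original dataset.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CrimeRecord:
    """One row of the crime/criminal data (hypothetical field names)."""
    crime_id: int        # unique identifier of the crime
    crime_type: str      # type of crime, e.g. "MOLESTATION"
    crime_date: date     # when the crime happened
    gender: str          # "M" or "F"
    age: int             # age of the individual criminal
    crime_address: str   # location of the crime, e.g. "TRIPOLI"
    married: bool        # marital status of the criminal


# Illustrative record only; the values are made up for this sketch.
example = CrimeRecord(
    crime_id=1,
    crime_type="MOLESTATION",
    crime_date=date(2012, 10, 12),
    gender="M",
    age=19,
    crime_address="TRIPOLI",
    married=False,
)
```

Making the record immutable (`frozen=True`) is a design choice of this sketch, so that raw rows cannot be changed by accident during later preprocessing.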
15. Data Mining Task Cont…
B. Data Preprocessing
Real-world data usually have the following drawbacks:
incompleteness, noise, and inconsistency. These data therefore need to
be preprocessed to make them suitable for analysis. The
preprocessing includes the following tasks:
1. Data cleaning.
2. Data integration.
3. Data transformation.
4. Data reduction.
5. Data discretization.
Figure 3: Attributes for crime and criminal. It also shows the distribution of
offenses versus different crime and criminal attributes.
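Two of the preprocessing tasks, data cleaning and data discretization, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the preprocessing actually used in this work: the function names, the sample rows, and the age bands are all hypothetical. Missing values are replaced with the column mean (numeric) or mode (nominal).

```python
from collections import Counter
from statistics import mean


def fill_missing(rows, numeric_keys, nominal_keys):
    """Data cleaning: replace None with the column mean (numeric) or mode (nominal)."""
    filled = [dict(row) for row in rows]  # copy so the raw rows stay untouched
    for key in numeric_keys:
        known = [r[key] for r in rows if r[key] is not None]
        default = round(mean(known))
        for r in filled:
            if r[key] is None:
                r[key] = default
    for key in nominal_keys:
        known = [r[key] for r in rows if r[key] is not None]
        default = Counter(known).most_common(1)[0][0]
        for r in filled:
            if r[key] is None:
                r[key] = default
    return filled


def discretize_age(age):
    """Data discretization: map a numeric age to a coarse, hypothetical age band."""
    if age < 18:
        return "under-18"
    if age < 30:
        return "18-29"
    if age < 45:
        return "30-44"
    return "45+"


# Made-up rows with two missing values, for illustration only.
rows = [
    {"AGE": 19, "GENDER": "M", "CRIMEADDRESS": "TRIPOLI"},
    {"AGE": None, "GENDER": "M", "CRIMEADDRESS": "JAFARA"},
    {"AGE": 31, "GENDER": None, "CRIMEADDRESS": "TRIPOLI"},
]
cleaned = fill_missing(rows, numeric_keys=["AGE"], nominal_keys=["GENDER", "CRIMEADDRESS"])
for r in cleaned:
    r["AGE_BAND"] = discretize_age(r["AGE"])
```

Discretizing age into bands is one way to turn a numeric attribute into the nominal items that association rule mining expects.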
17. Data Set
We use the crime database as the dataset for
our model. This database contains
real values of the crime and criminal attributes. We
use 70 percent of the data for training the
proposed model and 30 percent for testing. The
following table shows the data we used in our model.
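A 70/30 split of this kind can be sketched as below. This is a hedged Python illustration: the function name `train_test_split`, the random shuffle, the seed, and the 13 placeholder records are assumptions, since the slides do not say how the records were assigned to the two parts.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and split it into training and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


records = list(range(1, 14))  # stand-in IDs for 13 crime records (assumed size)
train, test = train_test_split(records)
```

With 13 records this gives 9 training and 4 testing records, since 70 percent of 13 rounds to 9.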
19. The MLCR proposed model
The proposed Mining Libyan Criminal Record (MLCR) model is implemented to work with two types of mining algorithms and so to produce two
different types of results. The two approaches are treated as sub-prototypes of the proposed MLCR model, as follows:
A. Mining Libyan Criminal Record using Association rules (MLCR-AR).
B. Mining Libyan Criminal Record using Clustering (MLCR-C).
20. The MLCR proposed model Cont…
A. Mining Libyan Criminal Record using
Association rules (MLCR-AR).
Association rule mining is a method used to generate rules from a
crime dataset based on frequently occurring patterns, to help the
decision makers of our security community take preventive
action.
Two of the most popular algorithms are Apriori and FP-growth. Association rule mining classically aims at discovering
associations between items in a transactional database.
The Apriori algorithm, also called the "sequential algorithm", was
developed by [Agrawal1994] and is a great accomplishment in the
history of mining association rules [Cheung1996c]. It is also the
best-known association rule algorithm. This technique is used
to perform association analysis on the attributes of crimes.
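The level-wise idea behind Apriori can be sketched as follows. This is a simplified Python illustration of frequent-itemset mining only (no rule generation), not the implementation used to produce the results in this work. The function name, the four sample transactions, and the 0.5 support threshold are assumptions chosen for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items}
    level = {s for s in level if support(s) >= min_support}

    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        level = {c for c in candidates if support(c) >= min_support}
    return frequent


# Made-up transactions in attribute=value form, for illustration only.
transactions = [
    {"CRIMEADDRESS=TRIPOLI", "GENDER=M", "MARRIED=NO"},
    {"CRIMEADDRESS=TRIPOLI", "GENDER=M", "MARRIED=NO"},
    {"CRIMEADDRESS=JAFARA", "GENDER=M", "MARRIED=NO"},
    {"CRIMEADDRESS=BENGHAZI", "GENDER=M", "MARRIED=YES"},
]
frequent = apriori(transactions, min_support=0.5)
```

The prune step is what distinguishes Apriori from brute-force enumeration: a candidate is counted against the data only if all of its smaller subsets were already found to be frequent.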
21. The MLCR proposed model Cont…
After applying the Apriori algorithm we got the following results:
Apriori
=======
Minimum support: 0.3 (4 instances)
Minimum metric <confidence>: 0.9
Number of cycles performed: 14
Generated sets of large itemsets:
Size of set of large itemsets L(1): 5
Size of set of large itemsets L(2): 6
Size of set of large itemsets L(3): 2
Best rules found:
1. MARRIED=NO 12 ==> GENDER=M 12 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
2. CRIMEADDRESS=TRIPOLI 5 ==> GENDER=M 5 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
3. CRIMEADDRESS=TRIPOLI 5 ==> MARRIED=NO 5 <conf:(1)> lift:(1.08) lev:(0.03) [0] conv:(0.38)
4. CRIMEADDRESS=TRIPOLI MARRIED=NO 5 ==> GENDER=M 5 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
5. CRIMEADDRESS=TRIPOLI GENDER=M 5 ==> MARRIED=NO 5 <conf:(1)> lift:(1.08) lev:(0.03) [0] conv:(0.38)
6. CRIMEADDRESS=TRIPOLI 5 ==> GENDER=M MARRIED=NO 5 <conf:(1)> lift:(1.08) lev:(0.03) [0] conv:(0.38)
7. CRIMEADDRESS=BENGHAZI 4 ==> GENDER=M 4 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
8. CRIMEADDRESS=JAFARA 4 ==> GENDER=M 4 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
9. CRIMEADDRESS=JAFARA 4 ==> MARRIED=NO 4 <conf:(1)> lift:(1.08) lev:(0.02) [0] conv:(0.31)
10. CRIMEADDRESS=JAFARA MARRIED=NO 4 ==> GENDER=M 4 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
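To make the rule measures easier to read, the short Python sketch below recomputes confidence, lift, and leverage for the rule CRIMEADDRESS=TRIPOLI ==> MARRIED=NO from raw counts, using the standard definitions of these measures. The function name is hypothetical, and the total of 13 records is an assumption taken from the size of the full dataset reported in the clustering output; the counts 5 and 12 are those shown next to the rules.

```python
def rule_metrics(n_total, n_premise, n_consequent, n_both):
    """Confidence, lift and leverage of a rule premise ==> consequent, from raw counts."""
    p_premise = n_premise / n_total
    p_consequent = n_consequent / n_total
    p_both = n_both / n_total
    confidence = n_both / n_premise               # P(consequent | premise)
    lift = confidence / p_consequent              # 1.0 means premise and consequent are independent
    leverage = p_both - p_premise * p_consequent  # excess co-occurrence over independence
    return confidence, lift, leverage


# Rule: CRIMEADDRESS=TRIPOLI (5 records) ==> MARRIED=NO (all 5 of them).
# Assumed: 13 records in total, 12 of which have MARRIED=NO.
conf, lift, lev = rule_metrics(n_total=13, n_premise=5, n_consequent=12, n_both=5)
print(round(conf, 2), round(lift, 2), round(lev, 2))  # 1.0 1.08 0.03
```

A lift of 1, as in the rules whose consequent is GENDER=M, means the rule adds no information beyond how common the consequent already is in the data.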
22. The MLCR proposed model Cont…
B. Mining Libyan Criminal Record using Clustering
(MLCR-C).
This prototype uses the same dataset as the MLCR-AR
prototype, but with clustering analysis.
Clustering is the technique used to group objects (crimes and
criminals) without a predefined specification for their attributes.
Clustering is unsupervised classification: there are no predefined classes. The simple K-means
clustering algorithm is used in this work.
The K-means algorithm clusters the data members into m groups,
where m is predefined. Inputs: crime type, number of
clusters, number of iterations. The initial seeds might
play an important role in the final results.
Step 1: Randomly choose cluster centers.
Step 2: Assign each instance to a cluster based on its
distance to the cluster centers.
Step 3: Adjust the centers of the clusters.
Step 4: Go to Step 2 until convergence.
Step 5: Output X0, X1, X2, X3.
Fig. 4: Criminal age vs. number of crimes after
applying the K-means algorithm
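The K-means loop can be sketched in a few lines of Python for a single numeric attribute such as age. This is an illustrative sketch, not the tool used to produce the results in this work: the function name and the sample ages are made up, and the initial centers are passed in explicitly rather than chosen at random so that the run is repeatable.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Cluster numbers around the given initial centers; return (centers, labels)."""
    centers = list(centers)  # initial seeds (fixed here instead of random)
    labels = []
    for _ in range(max_iter):
        # Assignment: each value joins the cluster with the nearest center.
        labels = [
            min(range(len(centers)), key=lambda c: abs(v - centers[c]))
            for v in values
        ]
        # Update: move each center to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        # Convergence: stop once the centers no longer move.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


ages = [18, 19, 20, 19, 30, 31, 29]  # made-up ages, for illustration only
centers, labels = kmeans_1d(ages, centers=[18, 30])
print(centers)  # [19.0, 30.0]
print(labels)   # [0, 0, 0, 0, 1, 1, 1]
```

Different initial centers can lead to a different final grouping, which is why the choice of seeds matters for the result.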
23. The MLCR proposed model Cont…
After applying the K-means algorithm we got this result:
=== Clustering model (full training set) ===
K-Means
======
Number of iterations: 3
Within cluster sum of squared errors: 43.0
Missing values globally replaced with mean/mode
Cluster centroids:
                               Cluster#
Attribute       Full Data      0              1
                (13)           (9)            (4)
===================================================
CRIMEID         13             13             65
CRIMETYPE       MOLESTATION    MOLESTATION    DACOITY
CRIMEADDRESS    TRIPOLI        JAFARA         TRIPOLI
CRIMEDATE       12OCT12        05NOV12        12OCT12
GENDER          M              M              M
MARRIED         NO             NO             NO
AGE             19             19             30
Time taken to build model (full training data) : 0 seconds
=== Model and evaluation on training set ===
Clustered Instances
0    9 ( 69%)
1    4 ( 31%)
25. A comparison between both algorithms
K-means algorithm
Clustering with the K-means algorithm groups data items
into classes based on characteristics such as age and gender.
For example, we can identify suspects and analyze those who commit
crimes in a similar fashion or in the same manner.
Apriori algorithm
Association rule mining with the Apriori algorithm mines frequent data
patterns by treating them as rules.
The recognized benefits include an improvement in the accuracy of
results over current semi-manual processes and a reduction in the
time taken to achieve those results.
27. Conclusion
Clustering and association rules were presented as data mining techniques to
automatically retrieve, extract, and evaluate information for knowledge
discovery from crime data.
This information was collected from several police departments.
Association rule mining is a data mining technique used
to identify relationships and to generate rules from a crime dataset based
on frequently occurring patterns, to help the decision makers of our security
community take preventive action.
Clustering is another data mining technique, used to group objects
(crimes and criminals) without a predefined specification for their
attributes.
The K-means and Apriori algorithms are used in
this paper.
Those algorithms were described in detail and a comparative study was