A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data
Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.

Published in: Technology
Transcript of "A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data "

  1. A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime Data. Presented by: Dr. Zakaria Suliman Zubi, Associate Professor, Computer Science Department, Faculty of Science, Sirte University, Sirte, Libya
  2. Contents
     • Abstract.
     • Introduction.
     • Data Mining Categories.
     • Why Analyze Crime?
     • Data Mining Task.
     • Data Set.
     • The MLCR Proposed Model.
     • Comparison Between Algorithms Used.
     • Conclusion.
  3. Abstract
     • Law enforcement agencies, represented by the police, face a large volume of data every day. These data can be processed and transformed into useful information. In this sense, data mining can be applied to greatly improve crime analysis, which can help to reduce and prevent crime as much as possible.
     • Crime reports and data are used as input for the formulation of crime prevention policies and strategic plans.
     • This work applies some data mining methods to analyze Libyan national criminal record data, to help the Libyan government make strategic decisions about curbing the currently high and rising crime rate.
     • The data was collected manually from the Benghazi, Tripoli, and AlJafara Supreme Security Committee (SSC).
     • Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.
     • A comparison between both algorithms is discussed in this work as well.
  5. Introduction
     • Data Mining, or Knowledge Discovery in Databases (KDD), is in simple words the nontrivial extraction of implicit, previously unknown, and potentially useful information from data.
     • KDD is the process of identifying a valid, potentially useful, and ultimately understandable structure in data.
     • Crime analysis is an emerging field in law enforcement without standard definitions. This makes it difficult to determine the focus of crime analysis for agencies that are new to the field.
     • Crime analysis is the act of analyzing crime. More specifically, crime analysis is the breaking up of acts committed in violation of laws into their parts to find out their nature, and the reporting of statements of these findings.
     • The role of crime analysts varies from agency to agency. The objective of most crime analysis is to find meaningful information in vast amounts of data and disseminate this information to officers and investigators in the field, to assist in their efforts to apprehend criminals and suppress criminal activity.
  7. Data Mining Categories
     • The Data Mining models are categorized into different leaves. Further, each leaf signifies the relationship, if any, that is highlighted from the database. The Data Mining models can be put into one of six main categories: 1) Association, 2) Classification, 3) Clustering, 4) Prediction, 5) Sequence Discovery, and 6) Generalization.
  8. Data Mining Categories Cont…
     • Table 2 classifies the various Data Mining algorithms according to problem type, namely, Association, Classification, Clustering, Prediction, Discovery, and Summarization.
  9. Data Mining Categories Cont…
  11. Why Analyze Crime?
      Crime analysts often have to justify their role within law enforcement agencies. It makes sense to analyze crime. Some good reasons are listed as follows:
      1. Analyze crime to inform law enforcers about general and specific crime trends, patterns, and series in an ongoing, timely manner.
      2. Analyze crime to take advantage of the abundance of information existing in law enforcement agencies, the criminal justice system, and the public domain.
      3. Analyze crime to maximize the use of limited law enforcement resources.
      4. Analyze crime to have an objective means to assess crime problems locally, regionally, and nationally, within and between law enforcement agencies.
      5. Analyze crime to be proactive in detecting and preventing crime.
      6. Analyze crime to meet the law enforcement needs of a changing society.
  12. Why Analyze Crime? Cont…
      • In general there are four different techniques for analyzing crimes, as follows:
        1. Linkage Analysis
        2. Statistical Analysis
        3. Profiling
        4. Spatial Analysis
      • Each of the above techniques has its own advantages and drawbacks and can be used in specific cases.
  14. Data Mining Task
      A. Data Collection Phase
      In this phase, the dataset that we used as training and testing data was extracted from the police departments. These data contain information about both crimes and criminals, with the following main attributes:
      1. Crime ID: individual crimes are designated by a unique crime ID.
      2. Crime type: indicates the crime type.
      3. Date: indicates when a crime happened.
      4. Gender: male or female.
      5. Age: age of the individual criminal.
      6. Crime address: location of the crime.
      7. Marital status: status of the criminal.
      Figure 2: Raw data.
  15. Data Mining Task Cont…
      B. Data Preprocessing
      • Real-world data usually have the following drawbacks: incompleteness, noise, and inconsistency. So these data need to be preprocessed to make them suitable for analysis. The preprocessing includes the following tasks:
        1. Data cleaning.
        2. Data integration.
        3. Data transformation.
        4. Data reduction.
        5. Data discretization.
      Figure 3: Attributes for crime and criminal. It also shows the distribution of offenses versus different crime and criminal attributes.
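The cleaning and discretization tasks listed above could look roughly like the following sketch. It is only an illustration: the records, the field names, and the age bands are hypothetical, and replacing missing values with the mean (numeric) or mode (nominal) is one common cleaning choice, not necessarily the exact procedure used in this work.

```python
from collections import Counter
from statistics import mean

# Hypothetical records; field names loosely follow the attributes listed earlier.
records = [
    {"age": 19, "gender": "M", "married": "NO"},
    {"age": None, "gender": "M", "married": None},
    {"age": 30, "gender": None, "married": "NO"},
    {"age": 41, "gender": "M", "married": "YES"},
]


def fill_missing(rows, numeric_fields, nominal_fields):
    """Data cleaning: replace missing numeric values with the mean
    and missing nominal values with the most frequent value (mode)."""
    filled = [dict(r) for r in rows]  # copy so the input is not modified
    for field in numeric_fields:
        known = [r[field] for r in rows if r[field] is not None]
        replacement = mean(known)
        for r in filled:
            if r[field] is None:
                r[field] = replacement
    for field in nominal_fields:
        known = [r[field] for r in rows if r[field] is not None]
        replacement = Counter(known).most_common(1)[0][0]
        for r in filled:
            if r[field] is None:
                r[field] = replacement
    return filled


def discretize_age(age):
    """Data discretization: map a numeric age to a band (bands are assumed)."""
    if age < 18:
        return "<18"
    if age < 30:
        return "18-29"
    return "30+"


cleaned = fill_missing(records, ["age"], ["gender", "married"])
for row in cleaned:
    row["age_band"] = discretize_age(row["age"])
```

Discretizing age into bands turns a numeric attribute into a nominal one, which is the form association rule mining over attribute=value items expects.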
  17. Data Set
      We consider the crime database as the dataset used in our model. This database contains real data values for crime and criminal attributes. We use 70 percent of the data for training the proposed model and 30 percent for testing. The following table shows the data we used in our model.
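A 70/30 split like the one described can be sketched as follows. This is a minimal illustration, assuming a simple random shuffle; the record IDs, the `seed` parameter, and the function name are hypothetical, and the source does not say how the split was actually drawn.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows reproducibly and cut them into training and testing parts."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical record IDs standing in for crime records.
record_ids = list(range(1, 14))
train, test = train_test_split(record_ids)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, so results on the testing part can be compared across runs.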
  19. The MLCR proposed model
      The Mining Libyan Criminal Record (MLCR) proposed model will be implemented to work with two types of mining algorithms and come up with two different types of results effectively. These two approaches are considered sub-prototypes of the proposed MLCR model:
      A. Mining Libyan Criminal Record using Association rules (MLCR-AR).
      B. Mining Libyan Criminal Record using Clustering (MLCR-C).
  20. The MLCR proposed model Cont…
      A. Mining Libyan Criminal Record using Association rules (MLCR-AR).
      • Association rule mining is a method used to generate rules from a crime dataset based on the frequent occurrence of patterns, to help the decision makers of our security society take preventive action.
      • Two of the most popular algorithms are Apriori and FP-growth. Association rule mining classically aims at discovering associations between items in a transactional database.
      • The Apriori algorithm, also called the “Sequential Algorithm”, was developed by [Agrawal1994]. It is a great accomplishment in the history of mining association rules [Cheung1996c], and it is also the best-known association rule algorithm. This technique is used to perform association analysis on the attributes of crimes.
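To make the frequent-pattern idea concrete, here is a minimal sketch of the level-wise Apriori search for frequent itemsets. It is a textbook-style illustration, not the implementation used in this work; the four transactions are hypothetical, with each crime record treated as a set of attribute=value items.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset whose support
    (fraction of transactions containing it) is at least min_support."""
    transactions = [frozenset(t) for t in transactions]
    min_count = min_support * len(transactions)
    # Level 1 candidates: every single item that appears anywhere.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical crime records encoded as attribute=value items.
toy_records = [
    {"GENDER=M", "MARRIED=NO", "CRIMEADDRESS=TRIPOLI"},
    {"GENDER=M", "MARRIED=NO", "CRIMEADDRESS=JAFARA"},
    {"GENDER=M", "MARRIED=YES", "CRIMEADDRESS=TRIPOLI"},
    {"GENDER=M", "MARRIED=NO", "CRIMEADDRESS=BENGHAZI"},
]
frequent_itemsets = apriori(toy_records, min_support=0.5)
```

The prune step is what gives Apriori its name: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded without counting them.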
  21. The MLCR proposed model Cont…
      After applying the Apriori algorithm we've got the following results:

      Apriori
      =======
      Minimum support: 0.3 (4 instances)
      Minimum metric <confidence>: 0.9
      Number of cycles performed: 14
      Generated sets of large itemsets:
      Size of set of large itemsets L(1): 5
      Size of set of large itemsets L(2): 6
      Size of set of large itemsets L(3): 2
      Best rules found:
       1. MARRIED=NO 12 ==> GENDER=M 12 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
       2. CRIMEADDRESS=TRIPOLI 5 ==> GENDER=M 5 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
       3. CRIMEADDRESS=TRIPOLI 5 ==> MARRIED=NO 5 <conf:(1)> lift:(1.08) lev:(0.03) [0] conv:(0.38)
       4. CRIMEADDRESS=TRIPOLI MARRIED=NO 5 ==> GENDER=M 5 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
       5. CRIMEADDRESS=TRIPOLI GENDER=M 5 ==> MARRIED=NO 5 <conf:(1)> lift:(1.08) lev:(0.03) [0] conv:(0.38)
       6. CRIMEADDRESS=TRIPOLI 5 ==> GENDER=M MARRIED=NO 5 <conf:(1)> lift:(1.08) lev:(0.03) [0] conv:(0.38)
       7. CRIMEADDRESS=BENGHAZI 4 ==> GENDER=M 4 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
       8. CRIMEADDRESS=JAFARA 4 ==> GENDER=M 4 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
       9. CRIMEADDRESS=JAFARA 4 ==> MARRIED=NO 4 <conf:(1)> lift:(1.08) lev:(0.02) [0] conv:(0.31)
      10. CRIMEADDRESS=JAFARA MARRIED=NO 4 ==> GENDER=M 4 <conf:(1)> lift:(1) lev:(0) [0] conv:(0)
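The conf, lift, and lev columns in this output follow from simple counts. The sketch below recomputes them for rule 3 (CRIMEADDRESS=TRIPOLI ==> MARRIED=NO) using the standard definitions of confidence, lift, and leverage. It assumes a total of 13 instances, the figure reported in the clustering output later in the deck; the function and variable names are illustrative.

```python
def rule_metrics(n, count_a, count_b, count_ab):
    """Standard metrics for a rule A ==> B.

    n        -- total number of records
    count_a  -- records matching the antecedent A
    count_b  -- records matching the consequent B
    count_ab -- records matching both A and B
    """
    support = count_ab / n
    confidence = count_ab / count_a
    lift = confidence / (count_b / n)
    leverage = support - (count_a / n) * (count_b / n)
    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
    }


# Rule 3: 5 records with CRIMEADDRESS=TRIPOLI, 12 with MARRIED=NO, 5 with both.
# n = 13 is an assumption taken from the clustering output in the same deck.
rule3 = rule_metrics(n=13, count_a=5, count_b=12, count_ab=5)
print({name: round(value, 2) for name, value in rule3.items()})
```

A lift only slightly above 1 indicates that the rule says little beyond the base rate: nearly all records are MARRIED=NO regardless of the crime address.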
  22. The MLCR proposed model Cont…
      B. Mining Libyan Criminal Record using Clustering (MLCR-C).
      • This prototype uses the same dataset as the MLCR-AR prototype, but with clustering analysis. Clustering is the technique used to group objects (crimes and criminals) without having predefined specifications for their attributes.
      • Clustering is unsupervised classification: there are no predefined classes. The simple K-means clustering algorithm is used in this work.
      • The K-means algorithm clusters the data members into m groups, where m is predefined. Input: crime type, number of clusters, number of iterations. The initial seeds might play an important role in the final results.
        Step 1: Randomly choose cluster centers.
        Step 2: Assign instances to clusters based on their distance to the cluster centers.
        Step 3: Adjust the centers of the clusters.
        Step 4: Go to Step 2 until convergence.
        Step 5: Output X0, X1, X2, X3.
      Fig. 4: Criminal age vs. number of crimes after applying the K-means algorithm.
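The K-means steps can be sketched in a few lines for a single numeric attribute. This is a simplified illustration under stated assumptions: it clusters ages only, takes the initial centers as an argument instead of choosing them randomly (so the run is reproducible), and repeats the assign and update steps until the centers stop changing. The ages and the function name are hypothetical; the source clusters full records with mixed attributes.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Cluster numbers around the given initial centers (one per cluster)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its members.
        new_centers = [
            sum(members) / len(members) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return centers, clusters


# Hypothetical criminal ages and two hand-picked initial seeds.
ages = [18, 19, 19, 20, 21, 29, 30, 31, 32]
final_centers, groups = kmeans_1d(ages, centers=[18, 32])
print(final_centers)
```

Passing different initial centers can lead to a different final grouping, which is why the choice of seeds matters for the result.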
  23. The MLCR proposed model Cont…
      After applying the K-means algorithm we've got this result:

      === Clustering model (full training set) ===
      K-Means
      ======
      Number of iterations: 3
      Within cluster sum of squared errors: 43.0
      Missing values globally replaced with mean/mode

      Cluster centroids:
                                     Cluster#
      Attribute      Full Data       0             1
                     (13)            (9)           (4)
      ===================================================
      CRIMEID        13              13            65
      CRIMETYPE      MOLESTATION     MOLESTATION   DACOITY
      CRIMEADDRESS   TRIPOLI         JAFARA        TRIPOLI
      CRIMEDATE      12OCT12         05NOV12       12OCT12
      GENDER         M               M             M
      MARRIED        NO              NO            NO
      AGE            19              19            30

      Time taken to build model (full training data) : 0 seconds

      === Model and evaluation on training set ===
      Clustered Instances
      0    9 ( 69%)
      1    4 ( 31%)
  25. A comparison between both algorithms
      • K-means algorithm: Clustering techniques using the K-means algorithm group data items into classes based on characteristics like age and gender. For example, we can identify suspects and analyze those who conduct crimes in a similar fashion or the same manner.
      • Apriori algorithm: Association rules using the Apriori algorithm mine frequent patterns by treating them as rules. The recognized benefits include an improvement in the accuracy of results over current semi-manual processes and a reduction in the time taken to achieve those results.
  27. Conclusion
      • Clustering and association rules were defined as data mining techniques to automatically retrieve, extract, and evaluate information for knowledge discovery from crime data. This information was collected from several police departments.
      • Association rule mining is one of the data mining techniques used to identify relationships and to generate rules from a crime dataset based on the frequent occurrence of patterns, to help the decision makers of our security society take preventive action.
      • Clustering is another data mining technique, used to group objects (crimes and criminals) without having predefined specifications for their attributes.
      • Algorithms such as the K-means algorithm and the Apriori algorithm are used in this paper.
      • Those algorithms were described in detail and a comparative study was…
  28. Thank you!