This document summarizes research analyzing crime data from Atlanta and Georgia Tech. It discusses using patrol analysis and identifying hot spots to optimize police patrol routes. Time series analysis of crime data revealed seasonal patterns, with some crime types peaking in September. Hot spot analysis identified concentrated areas of crime in Atlanta using statistical tests, with the nearest neighbor index method most accurately representing hot spots. In conclusion, optimizing patrol routes based on crime patterns and hot spots could lower crime rates and improve police efficiency.
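The nearest neighbor index mentioned above can be sketched in a few lines. This is the standard Clark-Evans ratio (observed mean nearest-neighbor distance over the 0.5/sqrt(n/A) expected under spatial randomness); the point grid and study area below are invented for illustration, not the study's data:

```python
import math

def nearest_neighbor_index(points, area):
    """Clark-Evans nearest neighbor index: observed mean nearest-neighbor
    distance divided by the distance expected under complete spatial
    randomness. NNI < 1 suggests clustering (a hot-spot pattern)."""
    n = len(points)
    # observed mean distance from each point to its nearest neighbor
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        total += min(math.hypot(x1 - x2, y1 - y2)
                     for j, (x2, y2) in enumerate(points) if j != i)
    observed = total / n
    # expected mean nearest-neighbor distance for n random points in the area
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected

# hypothetical: nine tightly clustered incidents in a 100 x 100 study area
cluster = [(50 + dx, 50 + dy) for dx in range(3) for dy in range(3)]
```

For this clustered pattern the index comes out far below 1, which is how the statistical test flags a hot spot.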
This report details applying the PPAC (Police Patrol Allocation and Coverage) model to crime data from the Georgia Tech Police Department from 2011-2014. The report cleans the data, formulates the PPAC optimization model to maximize police coverage of areas, particularly high crime "hot spots." The model generates optimal patrol zones. Future work involves re-zoning patrol areas based on the results and incorporating a time component to optimize patrol zones over time.
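The PPAC formulation itself is not reproduced in this summary, but a coverage-maximizing objective of this kind can be illustrated with a greedy sketch. The zone names, hot-spot weights, and reach sets below are hypothetical:

```python
def allocate_patrols(hotspot_weights, coverage, k):
    """Greedy sketch of a patrol-coverage objective: choose k patrol
    locations so the total crime weight of covered hot spots is maximized.
    coverage[loc] is the set of hot spots reachable from location loc."""
    covered, chosen = set(), []
    for _ in range(k):
        # pick the location adding the most uncovered crime weight
        best = max(coverage, key=lambda loc: sum(
            hotspot_weights[h] for h in coverage[loc] - covered))
        chosen.append(best)
        covered |= coverage[best]
    return chosen, sum(hotspot_weights[h] for h in covered)

# hypothetical hot spots weighted by crime counts, and three candidate zones
weights = {"A": 10, "B": 7, "C": 3, "D": 1}
reach = {"z1": {"A", "C"}, "z2": {"B"}, "z3": {"C", "D"}}
zones, total = allocate_patrols(weights, reach, 2)
```

A real PPAC model would solve this as an integer program rather than greedily, but the objective (weighted coverage of hot spots subject to a patrol budget) is the same shape.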
This document proposes reallocating the Georgia Tech Police Department's (GTPD) patrol zones based on analysis of past crime patterns to make the campus safer. The research team analyzed four years of crime data using clustering algorithms and time series analysis. They found crime clusters and relationships between certain crime types, and were able to predict future crime locations and types. The current four patrol zones are inefficient, as Zone 2 has noticeably more crimes than the others. The team aims to strategically define new zones that incorporate these findings, suggest more reasonable patrol routes, and make crime occurrences more uniform across zones. Their goal is to increase patrol efficiency and intercept more criminals to improve campus safety. The plan is to clean the data, analyze crime patterns, predict future crimes, and optimize the patrol zones.
Predictive policing uses statistical analysis and data to predict criminal activity and identify crime patterns in order to prevent future crimes and solve past cases. It relies on the idea that criminals tend to operate within a "comfort zone" and commit similar crimes in similar locations. Predictive policing involves collecting large data sets, analyzing the data to identify crime hot spots or individuals at risk of offending, intervening through police operations, and assessing the results to continue refining predictions. While predictive policing shows promise, its effectiveness depends on proper implementation and action based on predictions, and it has certain limitations in predicting some types of crimes.
PredPol: How Predictive Policing Works (PredPol, Inc.)
PredPol’s cloud-based predictive policing software enables law enforcement agencies to better prevent crime in their communities by generating predictions on the places and times that future crimes are most likely to occur.
PredPol’s technology has been helping law enforcement agencies to dramatically reduce crime in jurisdictions of all types and sizes, across the U.S. and overseas. Over the past year, Atlanta and Los Angeles have reduced specific crimes in targeted areas at rates ranging from nearly 20% to over 40%. Smaller jurisdictions, such as Norcross, Georgia, have seen nearly a 30% reduction in burglaries and robberies; in Alhambra, California, car burglaries have dropped 20% since the software technology was deployed.
Using advanced mathematics and computer learning, PredPol’s algorithms predict many types of crime, including property crimes, drug incidents, gang activity, and gun violence as well as traffic accidents.
Only three pieces of data are used to make predictions – type of crime, place of crime, and time of crime. No personal data is utilized in making these predictions.
Crime analysts and command staff using PredPol are 100% more effective than they are with traditional hotspot mapping at predicting where and when crimes are likely to occur. That means police have twice as many opportunities to deter and reduce crime.
Predictive Analysis of Crime Forecasting (Frank Smilda)
This document discusses various methods for predictive crime mapping, beginning with simply using past crime "hot spots" as a predictor of future hot spots. While this approach has limited accuracy over short periods, past hot spots can predict up to 90% of future crime over longer periods like a year. The document then reviews more sophisticated predictive modeling methods and the role of geographic information systems in developing spatial models to predict crime.
Predictive Policing Computational Thinking Show and Tell (Archit Sharma)
Predictive policing uses advanced data analysis and technology to predict where and when crimes are likely to occur based on patterns in historical crime data. The predictions are used to more efficiently deploy law enforcement resources to targeted areas in an effort to prevent crimes before they happen. Predictive policing algorithms analyze large datasets on past crimes, including details like type of crime, location, and timing, to identify patterns and assign probabilities of future criminal activity to specific regions.
Predictive Policing on Gun Violence Using Open Data (PredPol, Inc.)
This presentation is an abstract of a 2013 whitepaper published by PredPol.
PredPol delivers the same predictive accuracy for gun violence using unique mathematical methods. A study of Chicago data shows that PredPol successfully predicts 50% of gun homicides by flagging in real-time only 10.3% of city locations. Knowing where and when gun homicides are most likely to occur empowers law enforcement to use their knowledge, skills and experience to disrupt gun crime before it happens.
The study uses open government data from Chicago and predictive crime analysis.
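The 50%-of-homicides-from-10.3%-of-locations result can be read as a hit-rate-per-area measure. One common way to express it is the Predictive Accuracy Index (PAI), sketched here using the figures quoted above:

```python
def predictive_accuracy_index(hits_in_flagged, total_hits,
                              flagged_area, total_area):
    """Predictive Accuracy Index: share of crimes captured divided by
    share of area flagged. Random flagging scores PAI = 1; higher is
    better."""
    hit_rate = hits_in_flagged / total_hits
    area_rate = flagged_area / total_area
    return hit_rate / area_rate

# 50% of gun homicides captured while flagging 10.3% of city locations
pai = predictive_accuracy_index(50, 100, 10.3, 100)
```

Capturing half of all incidents from roughly a tenth of the locations works out to a PAI near 4.9, i.e. almost five times better than chance.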
This document summarizes a crime analysis project conducted by a university team. The team analyzed multiple data sets to build models predicting crime rates based on factors like population, weather, daylight hours, and economic indicators. They created binary, numeric, and crime ratio models and found the crime ratio model was most statistically significant. The team's analyses found crime rates generally increased with daylight savings time and increased slightly with higher temperatures. Their best model could predict monthly crime totals by city for most crime types except rare crimes like homicide and sexual assault.
Using Data Mining Techniques to Analyze Crime Pattern (Zakaria Zubi)
Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.
According to studies conducted at the University of California, crime in an area follows a pattern similar to earthquake aftershocks: the earthquake itself is difficult to predict, but the aftershocks that follow it are quite predictable. The same is true of crimes occurring within a geographical area.
This document discusses using data mining techniques like clustering to detect crime patterns from crime data. It proposes using a k-means clustering algorithm with attribute weighting to group similar crimes. Testing on real crime data from a sheriff's office, it was able to identify crime patterns that detectives could validate matched actual crime sprees. The method provides an automated way to detect patterns and help detectives solve crimes faster by focusing on clustered groups of related incidents.
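A minimal version of k-means with attribute weighting might look like the following. The records, weights, and iteration count are illustrative, not the paper's actual data or parameters:

```python
import math
import random

def weighted_kmeans(records, weights, k, iters=20, seed=0):
    """Sketch of k-means with per-attribute weighting: attributes that
    matter more for linking crimes (e.g. modus operandi) get larger
    weights in the distance, so clusters align with likely sprees.
    records: equal-length numeric attribute vectors."""
    random.seed(seed)
    centers = random.sample(records, k)

    def dist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2
                             for w, x, y in zip(weights, a, b)))

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in records:
            groups[min(range(k), key=lambda i: dist(r, centers[i]))].append(r)
        # recompute each center as the mean of its group (keep old if empty)
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

# two obvious groups in the first attribute; the second attribute is noise,
# and the low weight on it keeps it from disturbing the clustering
data = [[1, 50], [2, 40], [1.5, 60], [10, 55], [11, 45], [10.5, 50]]
centers, groups = weighted_kmeans(data, weights=[1.0, 0.01], k=2)
```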
Crime Pattern Detection using K-Means Clustering (Reuben George)
Crime pattern detection uses data mining techniques like clustering to analyze crime data and identify patterns. This involves plotting past crimes geographically, clustering similar crimes to detect sprees, and analyzing the results to draw conclusions. It helps improve crime solving by learning from history and preempting future crimes. The method augments detectives' work but has limitations, such as its reliance on data quality. Overall, crime pattern detection aids operational efficiency and enhances resolution rates by optimizing resource deployment based on observed crime trends.
GIS aids crime analysis by identifying patterns and trends, supporting intelligence-led policing strategies, and integrating diverse data sources. It enhances crime analysis by highlighting suspicious incidents, supporting cross-jurisdictional pattern analysis, and educating the public. GIS provides tools to capture crime series, forecast crime, and optimize resource allocation to reduce crime and disorder.
This document discusses the application of geographic information systems (GIS) in criminology and defense intelligence. It provides examples of how GIS has been used to map crime rates and identify spatial patterns in criminal behavior. GIS allows crime analysis to identify crime hotspots, support investigative leads, and help allocate law enforcement resources more efficiently. The document also outlines how GIS aids tactical crime analysis and criminal investigations through geographic profiling. Finally, it notes that GIS is increasingly important for military applications by helping commanders understand terrain influences on operations.
This document summarizes a study that used data mining techniques to predict crime using real-world crime datasets from Denver and Los Angeles. The goals were to identify crime hotspots and predict future crime types based on location, time, and other attributes. The models tested included the Apriori algorithm to identify frequent crime patterns, a naïve Bayesian classifier to predict crime type based on location/time features, and a decision tree classifier. Key results identified crime hotspots and showed the Bayesian classifier achieved prediction accuracies of 51-54% while the decision tree was more complex and achieved lower accuracy.
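A naïve Bayesian classifier of the kind described can be sketched with categorical features and Laplace-style smoothing. The neighborhoods, time bands, and crime labels below are invented, and the smoothing denominator is a simplification:

```python
from collections import Counter, defaultdict

def train_nb(rows):
    """Tiny naive Bayes sketch for crime-type prediction from
    categorical features such as (neighborhood, time band)."""
    priors = Counter(label for *_, label in rows)
    counts = defaultdict(Counter)  # (feature_index, label) -> value counts
    for *feats, label in rows:
        for i, v in enumerate(feats):
            counts[(i, label)][v] += 1
    return priors, counts

def predict_nb(priors, counts, feats):
    total = sum(priors.values())
    best, best_p = None, -1.0
    for label, c in priors.items():
        p = c / total
        for i, v in enumerate(feats):
            seen = counts[(i, label)]
            # add-one smoothing; vocabulary approximated by seen values + 1
            p *= (seen[v] + 1) / (sum(seen.values()) + len(seen) + 1)
        if p > best_p:
            best, best_p = label, p
    return best

data = [("downtown", "night", "robbery"),
        ("downtown", "night", "robbery"),
        ("campus", "day", "larceny"),
        ("campus", "day", "larceny"),
        ("campus", "night", "larceny")]
priors, counts = train_nb(data)
```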
This document outlines a project to analyze crime and census data in London. It describes a multi-phase approach including: 1) loading and visualizing crime data, 2) adding census data to the model and performing clustering and regression analysis, and 3) using the results to inform data mining. Key analysis techniques include k-means clustering of census variables to categorize areas, linear regression of census factors on crime types, and decision tree analysis using both crime and census data. The goal is to understand how socioeconomic factors relate to crime levels and types in different parts of London.
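The regression step (fitting crime counts against a single census factor) reduces to ordinary least squares. The factor values and crime counts here are hypothetical:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of crime counts (ys) against one
    census factor (xs), e.g. unemployment rate per area.
    Returns slope and intercept of y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# hypothetical areas: census factor value vs. recorded crime count
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

The sign and size of the slope are what the London study would read off per crime type to judge how strongly a socioeconomic factor tracks crime levels.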
This document discusses various spatial analysis techniques for predicting the next location of offenses in a crime series, including standard deviation rectangles and ellipses, convex hull polygons, correlated walk analysis, and analyzing distance between hits, target locations, and journey to crime data. It provides examples of analyses of past crime series where these techniques successfully predicted over 50% of next hits. The document advocates combining multiple analytical methods and data sources to refine location predictions.
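The standard deviation rectangle is the simplest of these search-area techniques: the mean center of the series plus or minus one standard deviation on each axis. A sketch, with an invented four-incident series:

```python
import statistics

def std_dev_rectangle(points):
    """Standard deviation rectangle: the mean center of a crime series
    extended by one (population) standard deviation per axis, a classic
    first-cut search area for the next offense in the series."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    cx, cy = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    return (cx - sx, cy - sy), (cx + sx, cy + sy)

# hypothetical incident coordinates for one series
series = [(0, 0), (2, 0), (0, 2), (2, 2)]
lo, hi = std_dev_rectangle(series)
```

Ellipses, convex hulls, and correlated walk analysis refine this rectangle; the document's point is that combining several such estimates narrows the prediction.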
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime... (Zakaria Zubi)
Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.
The document discusses a proposed system for detecting ranking fraud for mobile apps. It begins by describing existing ranking fraud and some current detection systems. It then outlines the proposed system which first identifies "leading sessions" in an app's historical ranking data that indicate periods of popularity. It then detects fraud by analyzing ranking, rating, and review behaviors during these sessions using statistical tests. Finally, the proposed system aggregates all the evidence to evaluate sessions for fraud and was tested on real app store data.
- The document proposes a machine learning project using the Chicago Crime dataset to build a web application providing insights into crime patterns.
- It will include geospatial analysis and visualizations of crime hotspots and trends over time using ArcGIS maps, as well as statistical analysis and prediction of future crimes.
- The project involves preprocessing the large dataset, performing feature engineering, dividing Chicago into crime clusters, and building prediction models for each cluster to be deployed via REST API and integrated into the web application. Tools include Python, Docker, Azure ML, ArcGIS, and Java for the frontend.
Crime Rate Analysis Using k-NN in Python
Crime Analysis & Prediction System analyzes crime data to detect crime hotspots and predict future crimes.
It collects data from various sources: crime data from OpenData sites, US census data, social media, and traffic and weather data.
It leverages Microsoft's Azure cloud and on-premise technologies for back-end processing, with desktop-based visualization tools.
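A bare-bones k-NN classifier in the spirit of the title above might look like this; the incident coordinates and crime labels are invented:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Minimal k-nearest-neighbors sketch: classify a location (x, y)
    by the majority crime label among its k closest historical
    incidents. train: list of (x, y, label) tuples."""
    nearest = sorted(train, key=lambda r: math.hypot(r[0] - query[0],
                                                     r[1] - query[1]))[:k]
    return Counter(label for _, _, label in nearest).most_common(1)[0][0]

# hypothetical incident history: two spatially separated crime types
history = [(0, 0, "theft"), (0, 1, "theft"), (1, 0, "theft"),
           (9, 9, "assault"), (9, 8, "assault"), (8, 9, "assault")]
```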
The Pennsylvania State Police chose ABM's Prophecy incident mapping and predictive analysis tool as part of its Records Management System solution in 2001. The implementation of Prophecy began in May 2003 and was completed within 60 days. During the first eight months of use, fatalities decreased by 7.4% and criminal offenses decreased by 1.3%. Prophecy has given the PSP improved analytical capabilities to more efficiently target resources.
Machine Learning Approaches for Crime Pattern Detection (APNIC)
This document discusses machine learning approaches for predicting crime patterns. It begins by stating the large number of violent crimes in the US and explaining that predicting crimes can help avoid them and ensure better resource allocation. It then discusses existing crime prediction systems like PredPol and the general crime prediction process of data gathering, classification/clustering, and prediction. It provides various methods for data gathering, like crime records, social media, IoT devices, and newspapers. It also discusses clustering algorithms like k-means that can be used. Finally, it notes that PredPol has achieved a 22.7% reduction in crimes in one area, but that combining additional techniques like machine learning, big data analysis, and image processing could further improve crime prediction.
Propose Data Mining AR-GA Model to Advance Crime Analysis (IOSR Journals)
This document proposes a data mining model to advance crime analysis using association rule (AR) and genetic algorithm (GA). The model has three correlated dimensions: a crime dataset, criminal dataset, and geo-crime dataset. AR will be applied to each dataset separately to extract patterns, then GA will be used to mix the resulting ARs and exploit relationships across the three dimensions. This is intended to help detect universal crime patterns and speed up the crime solving process. The model was applied to real crime data from a sheriff's office and validated. Privacy-preserving techniques are also suggested to hide sensitive rules from appearing in the results.
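The association-rule half of the AR-GA model rests on support and confidence thresholds. A stripped-down, single-item-antecedent version (the transactions below are invented) could look like:

```python
from itertools import combinations

def association_rules(transactions, min_support=0.5, min_conf=0.8):
    """Bare-bones association-rule sketch: finds rules X -> Y between
    single items whose support (co-occurrence rate) and confidence
    (co-occurrence given X) clear the given thresholds."""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    rules = []
    for x, y in combinations(sorted(items), 2):
        for a, b in ((x, y), (y, x)):
            both = sum(1 for t in transactions if a in t and b in t) / n
            supp_a = sum(1 for t in transactions if a in t) / n
            if both >= min_support and supp_a and both / supp_a >= min_conf:
                rules.append((a, b))
    return rules

# hypothetical crime-attribute transactions
txns = [{"burglary", "night"}, {"burglary", "night"},
        {"burglary", "night"}, {"robbery", "night"}]
```

The full model mines such rules per dataset and then uses a genetic algorithm to combine them across the three dimensions; this sketch covers only the rule-extraction step.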
Database and Analytics Programming - Project Report (sarthakkhare3)
The document summarizes research conducted on crime data from New York City in 2018. The researchers collected data on complaints, arrests, court summons, and prison admissions. They analyzed relationships between these datasets and performed visualizations. Key findings include: the number of complaints exceeded arrests and further declined on the paths to court summons and prison admissions; the top crimes differed between complaints/arrests and court summons/prison admissions; males and those aged 25-44 committed most crimes; Bronx and Manhattan had higher crime rates per capita than other boroughs. The research was limited to one year, and additional analysis could provide more insights into factors affecting proportions and more accurate crime prediction.
This document analyzes crime data from Georgia Tech and Atlanta. It summarizes time series analyses of GT crime data from 2010-2014 which found higher crime, especially larceny, in September when students return for fall semester. Clustering analysis of Atlanta crime data from 2012-2013 identified hot spots of crime and partitioned the city into 6 crime clusters. The analysis also found relationships between aggravated assault, auto theft, burglary, and robbery crimes, with assault often peaking before other crimes seasonally. Future research directions are proposed.
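The seasonal check described above, counting incidents per month across years to find the peak, is a few lines with a Counter; the (year, month) incident log below is invented:

```python
from collections import Counter

def peak_month(incidents):
    """Count incidents per calendar month across all years and return
    the month number with the most crime, the simple seasonal summary
    behind the September finding above."""
    by_month = Counter(month for _, month in incidents)
    return by_month.most_common(1)[0][0]

# hypothetical (year, month) log; September (9) spikes at fall return
log = [(2012, 9), (2012, 9), (2012, 10), (2013, 9), (2013, 2), (2013, 9)]
```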
Crime Data Analysis and Prediction for city of Los Angeles (Heta Parekh)
This document analyzes crime data from Los Angeles from 2010-2020 to identify trends, predict future crime rates, and make recommendations to law enforcement. Key findings include:
- Crime rates have generally declined over the past decade but dropped significantly in 2020 due to the pandemic.
- Robbery, burglary, and vandalism are the most common crimes.
- Areas with lower median household incomes tend to have higher crime rates.
- Females are consistently the most impacted victims of crime over the past 10 years.
- Southwest LA and other areas have been identified as "hot spots" for criminal activity.
Predictive analysis indicates crime rates will continue increasing post-lockdown in
Mr. Friend is acrime analystwith the SantaCruz, Califo.docx (audeleypearl)
Mr. Friend is a crime analyst with the Santa Cruz, California, Police Department.
Predictive Policing: Using Technology to Reduce Crime
By Zach Friend, M.P.P.
4/9/2013
Nationwide law enforcement agencies face the problem
of doing more with less. Departments slash budgets
and implement furloughs, while management struggles
to meet the public safety needs of the community. The
Santa Cruz, California, Police Department handles the
same issues with increasing property crimes and
service calls and diminishing staff. Unable to hire more
officers, the department searched for a nontraditional
solution.
In late 2010 researchers published a paper that the
department believed might hold the answer. They
proposed that it was possible to predict certain crimes,
much like scientists forecast earthquake aftershocks.
An “aftercrime” often follows an initial crime. The time and location of previous criminal activity helps to
determine future offenses. These researchers developed an algorithm (mathematical procedure) that
calculates future crime locations.1
Equalizing Resources
The Santa Cruz Police Department has 94 sworn officers and serves a population of 60,000. A
university, amusement park, and beach push the seasonal population to 150,000. Department personnel
contacted a Santa Clara University professor to apply the algorithm, hoping that leveraging technology
would improve their efforts. The police chief indicated that the department could not hire more officers.
He felt that the program could allocate dwindling resources more efficiently.
Santa Cruz police envisioned deploying officers by shift to the most targeted locations in the city. The
predictive policing model helped to alert officers to targeted locations in real time, a significant
improvement over traditional tactics.
Making it Work
The algorithm is a culmination of anthropological and criminological behavior research. It uses complex
mathematics to estimate crime and predict future hot spots. Researchers based these studies on
Student #1 I have chosen to write about the history of data anal.docx (johniemcm5zt)
Student #1
I have chosen to write about the history of data analysis for the Los Angeles Police Department. While I currently reside in Colorado Springs, Colorado and work as a deputy sheriff in Denver, Colorado I grew up in the greater Los Angeles area and I know that they should have a large amount of data to draw from.
Currently the Los Angeles Police Department uses COMPSTAT to compile their data. They have a unit, known as the COMPSTAT unit, whose sole job is to compile crime statistics and analyze the data (Los Angeles Police Department, 2016). COMPSTAT is short for computer statistics. COMPSTAT was developed by Police Commissioner William Bratton in 1994 for use by the New York Police Department. According to the University of Maryland, by the year 2000 over a third of police agencies with over 100 officers were utilizing some sort of COMPSTAT-like program (University of Maryland, 2015). In 2002 William Bratton became the Chief of Police for the Los Angeles Police Department and brought with him the concept of COMPSTAT. During the first six years of his tenure Los Angeles saw a steady decrease in the city's crime rates, thanks in large part to COMPSTAT policing.
Mean, mode and median play a large part in analyzing criminal data. The mean is the average. For example, if in neighborhood C there were 14 robberies committed on Monday between 1 and 3 AM, 17 robberies on Tuesday during the same time period, and 9 on Wednesday during the same time period, the mean would be 13.3 robberies per night for those 3 nights. Knowing this is high for the city, the data could be used to justify extra police presence in neighborhood C. An example of the mode would be if, in the same neighborhood in the same week, there were 17 robberies on both Friday and Saturday, 12 on Thursday, and 11 on Sunday. The mode would be 17, and it would also be a reason to add extra police presence in the neighborhood until a significant decrease was seen in the number of robberies taking place. Finally we come to the median: simply line the numbers up for the week and take the number that falls in the middle. In the case of the robberies occurring in neighborhood C, that number would be 14. All of this data can be combined to show watch commanders and captains the areas where they should be focusing their officers' time. If a neighborhood has seen only one or two robberies during the week, it is definitely not in as much need of a heavy police presence as neighborhood C is.
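The three statistics above can be checked with Python's standard library; the robbery counts are the hypothetical figures from the paragraph:

```python
import statistics

# Mean of the Monday-Wednesday robbery counts (14, 17, 9)
mean = statistics.mean([14, 17, 9])                      # ~13.3 per night
# Mode of the Thursday-Sunday example (17, 17, 12, 11)
mode = statistics.mode([17, 17, 12, 11])                 # 17
# Median of the full week of counts for neighborhood C
median = statistics.median([14, 17, 9, 12, 17, 17, 11])  # 14
```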
Student #2
Beginning in the mid-1990s, police in New York began to run statistical analysis of the city's crime reports, arrests, and other police activity, known as COMPSTAT. Since this analysis began, law enforcement agencies have implemented their own data-driven approaches to tracking and adapting to crime trends. The LAPD is both heavily armed and thoroughly computerized; the Real-Time Analysis and Critical Response Division is its central processor.
This paper focuses on finding spatial and temporal criminal hotspots. It analyses two different real-world crime datasets, for Denver, CO and Los Angeles, CA, and provides a comparison between the two datasets through a statistical analysis supported by several graphs. It then explains how the Apriori algorithm was applied to produce interesting frequent patterns for criminal hotspots. In addition, the paper shows how a Decision Tree classifier and a Naïve Bayesian classifier were used to predict potential crime types. To further analyse the crime datasets, the paper introduces a study combining the findings from the Denver crime dataset with its demographic information in order to capture the factors that might affect the safety of neighborhoods. The results of this solution could be used to raise people's awareness of dangerous locations and to help agencies predict future crimes in a specific location within a particular time.
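The frequent-pattern step can be illustrated with brute-force support counting over small transaction sets. This is only a sketch of what Apriori computes (true Apriori prunes candidate itemsets level by level); `frequent_itemsets` and the sample incident records are hypothetical, not the paper's code:

```python
from itertools import combinations
from collections import Counter

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force support counting: tally every itemset up to max_size
    and keep those whose support meets min_support. Apriori reaches the
    same answer more efficiently by pruning candidates level by level."""
    n = len(transactions)
    result = {}
    for size in range(1, max_size + 1):
        counts = Counter()
        for t in transactions:
            for combo in combinations(sorted(t), size):
                counts[combo] += 1
        for itemset, c in counts.items():
            if c / n >= min_support:
                result[itemset] = c / n
    return result

# Hypothetical incident records: {hour band, crime type, district}
incidents = [
    {"night", "theft", "downtown"},
    {"night", "theft", "downtown"},
    {"day", "assault", "downtown"},
    {"night", "theft", "suburb"},
]
patterns = frequent_itemsets(incidents, min_support=0.5)
```

Here the pair ("night", "theft") appears in three of four incidents, so it survives the 0.5 support threshold and would be reported as a candidate hotspot pattern.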
This document discusses using machine learning techniques like clustering and decision trees to analyze crime data from Chicago between 2014-2016. It aims to identify crime hot spots and patterns to help police allocate resources more efficiently. The document applies k-means clustering to crime data grouped by location and type, identifying a "vice" cluster with crimes like prostitution and drugs in two adjacent wards. It suggests police could use temporal and hourly crime patterns from the analysis to optimize staff scheduling and deployment. The document also discusses using decision trees and k-nearest neighbors algorithms on the crime data supplemented with temperature and unemployment data to further explore crime patterns.
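A minimal sketch of the k-means step described above, assuming 2-D points such as (latitude, longitude) pairs; the function and sample data are illustrative, not the document's actual pipeline:

```python
def kmeans(points, k, iters=50):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its cluster. Initialized with the
    first k points for determinism (a real run would use random or
    k-means++ seeding)."""
    centroids = list(points[:k])
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: (p[0] - centroids[i][0]) ** 2 +
                                        (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        centroids = [(sum(p[0] for p in c) / len(c),
                      sum(p[1] for p in c) / len(c)) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Two hypothetical incident blobs; k-means recovers one centroid per blob.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(pts, 2)
```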
IRJET- Detecting Criminal Method using Data Mining (IRJET Journal)
The document discusses using data mining techniques like clustering algorithms to help detect criminal methods and patterns from crime data. It proposes applying a weighted k-means clustering approach to group similar crimes together based on important attributes. This would help identify potential criminal patterns and present them to detectives in a geospatial plot, highlighting crime clusters. The results were checked against court case outcomes and some criminal patterns were confirmed. The authors conclude the method helps detectives by organizing large crime datasets but requires close collaboration and domain knowledge to effectively map real crime data for mining.
Abstract : Crime prediction is a topic of significant research across the fields of criminology, data mining, city planning, law enforcement, and political science. Crime patterns exist on a spatial level; these patterns can be grouped geographically by physical location, and analyzed contextually based on the region
in which crime occurs. This paper proposes a mechanism to parameterize street-level crime, localize crime hotspots, identify correlations between spatiotemporal crime patterns and social trends, and analyze the resulting data for the purposes of knowledge discovery and anomaly detection. The subject of this study is the county of Merseyside in the United Kingdom, over a span of 21 months beginning in December 2010 (monthly) through August 2012. Several types of crime are analyzed in this dataset, including Burglary and Antisocial Behavior. Through this analysis, several interesting findings are drawn about crime in Merseyside, including: hotspots with steadily increasing crime levels, hotspots with unstable crime levels, synchronous changes in crime trends throughout Merseyside as a whole, individual months in which certain hotspots behaved anomalously, and a strong correlation between crime hotspot locations and borough/postal code locations. We believe that this type of statistical and correlative analysis of crime patterns will help law enforcement agencies predict criminal activity, allocate resources, and promote community awareness to reduce overall crime rates.
Journal of Criminal Justice, Vol. 7, pp. 217-241 (1979). Per.docx (priestmanmable)
Journal of Criminal Justice, Vol. 7, pp. 217-241 (1979).
Pergamon Press. Printed in U.S.A.
Copyright © 1979 Pergamon Press Ltd
INFORMATION, APPREHENSION, AND DETERRENCE:
EXPLORING THE LIMITS OF POLICE PRODUCTIVITY
WESLEY G. SKOGAN
Department of Political Science and Center for Urban Affairs
Northwestern University
Evanston, Illinois 60201
GEORGE E. ANTUNES
Department of Political Science
University of Houston
Houston, Texas 77004
and
Workshop in Political Theory and Policy Analysis
Indiana University
Bloomington, Indiana 47401
ABSTRACT
The capacity of police departments to solve crimes and
apprehend offenders is low for many types of crime, particularly
crimes of profit. This article reviews a variety of studies
of police apprehension and hypothesizes that an important
determinant of the ability of the police to apprehend criminals
is information. The complete absence of information for
many types of crime places fairly clear upper bounds on the
ability of the police to effect solutions.
To discover whether these boundaries are high or low we
analyzed data from the 1973 National Crime Panel about the
types and amount of information potentially available to police
through victim reports and patrol activities. The evidence
suggests that if the police rely on information made readily
available to them, they will never do much better than they
are doing now. On the other hand, there appears to be more
information available to bystanders and passing patrols than
currently is being used, which suggests that surveillance
strategies and improved police methods for eliciting, recording,
and analyzing information supplied by victims and witnesses
might increase the probability of solving crimes and
making arrests. In light of this we review a few possibly helpful
innovations suggested in the literature on police productivity
and procedure.
Some characteristics of the crime itself, or of events surrounding the crime, that are
beyond the control of investigators, determine whether it will be cleared in most
instances. (Greenwood et al., 1975: 65)
There is no feasible way to solve most crimes except by securing the cooperation of
citizens to link a person to the crime. (Reiss, 1971: 105)
INTRODUCTION
A recent spate of studies of crime and the deterrent effectiveness of the criminal
justice system has raised anew a question as old as Bentham: Does raising the cost of
criminal activity significantly reduce the level of crime in a community? In these studies,
the cost of criminal activity has been conceptualized in two ways: as the loss of time and
opportunity attendant to apprehension (measured by the certainty of arrest or punishment),
and as the stigma, discomfort, and loss of opportunity that come with conviction
by the courts (measured by the severity of punishment). Indicators of the di ...
The document discusses various mechanisms of accountability for police departments, including internal mechanisms like COMPSTAT and external mechanisms like civilian review boards and courts. COMPSTAT is an internal accountability method that uses data analysis of crime statistics to hold police commanders accountable for addressing problems in their jurisdictions. It promotes focused problem-solving in police meetings. External accountability comes from civilian oversight like review boards and courts establishing police misconduct standards.
The document discusses the challenges facing law enforcement in addressing cybercrime. It summarizes the results of a survey of 185 analysts from UK law enforcement organizations. The analysts believe that the amount of time they spend on cybercrime will triple in the next three years, but only 30% believe they currently have the necessary skills and tools to address cybercrime effectively. The document calls cybercrime a "tipping point" and makes recommendations for how law enforcement can improve its ability to investigate cybercrime through collaborative approaches, new digital tools and training, and focusing on intelligence to enable operational outcomes.
IRJET- Crime Analysis using Data Mining and Data Analytics (IRJET Journal)
This document discusses using data mining and analytics techniques to analyze crime data and predict crime rates. It proposes using linear regression on crime data from the Indian government to predict future crime occurrences and identify high-risk regions. The system would analyze factors like crime type, offender age, month, and year to build a regression model. This model could then predict crime rates and indicate whether a region is high or low risk for criminal activity. Graphs and tables would visualize the predictions to help law enforcement allocate resources. The goal is to help reduce crime and increase public safety by identifying patterns in historical crime data.
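The linear-regression idea above can be sketched with an ordinary least-squares fit of crime counts against a single predictor such as year; `fit_line` and the yearly figures are hypothetical, not the proposed system's model:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x, e.g. yearly crime
    totals against year, to extrapolate future occurrences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

years = [2015, 2016, 2017, 2018]   # hypothetical yearly totals
counts = [100, 90, 80, 70]
a, b = fit_line(years, counts)
# extrapolated count for 2019: a + b * 2019 == 60.0
```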
Crime Analysis based on Historical and Transportation Data (Valerii Klymchuk)
Contains experimental results based on real crime data from an urban city. Our set of statistics reveals seasonality in crime patterns and accompanies predictive machine learning models that assess the risks of crime. Moreover, this work provides a discussion of the implementation and design of a prototype cloud-based crime analytics dashboard.
This document discusses problem-oriented policing and the SARA model. SARA is an acronym that stands for scanning, analysis, response, and assessment, which are the four steps used to identify, analyze, and select problems. The document examines a problem-oriented policing guide about liquor store robberies. It describes factors that can contribute to liquor store robberies, such as cash transactions and lone employees. The guide offers suggestions for responses like improving lighting and visibility to address the problem.
The document discusses the increasing role of technology in law enforcement. It describes how predictive policing uses data analysis to predict crime hotspots, and how camera surveillance has helped reduce crime in some areas. While technology provides benefits like helping solve crimes, it also raises challenges regarding privacy, data storage and security, and costs. As technology advances, the debate around its use in policing will continue between those who emphasize its benefits and those concerned about privacy issues.
This document proposes a method for clustering related Chinese text cases using a combination of Fuzzy K-Means (FKM) clustering and Canopy clustering algorithms. It first discusses factors that can indicate related cases, such as location, time, and evidence. It then describes representing Chinese text cases as vectors using word segmentation and dimensionality reduction. FKM clustering is applied to group related cases but has limitations. Canopy clustering is then used to estimate the optimal number of clusters before applying FKM, overcoming those limitations and improving efficiency. While this method addresses the challenges of clustering Chinese cases, the results may not be very accurate and the method could be improved.
Analysis on crimes in Atlanta
Undergraduate Research Team
Georgia Institute of Technology
ISYE 4699
December 7 2014
1 Abstract
In this report, we present the conclusions we reached using the Georgia
Tech and Atlanta crime data. The areas of study we were interested in
were patrol analysis, hot spots, and correlations among crimes. Using patrol
analysis, we studied how current patrol routes could be changed. Improvements
would result in shorter police response times, lower crime
rates, less economic waste, and more. A hot spot is an area with
concentrated crime. We developed an algorithm for spotting hot spots in
Atlanta. This program needs improvement, but once we upgrade it to locate
hot spots more accurately, we will be able to compare hot spots across types
of crimes and find overlaps among them. Finally, we studied how one
crime led to other crimes, focusing on the relationship between
auto theft and burglary.
2 Patrol Analysis
After we received the data from the Georgia Tech Police Department, we
filtered out unnecessary variables to conduct our preliminary research. We
were left with the 18 most useful variables, and our research involved using
this information to come up with an improved solution. The variables are
listed below for reference:
• OCANumber
• IncidentFromDate
• IncidentFromTime
• IncidentToDate
• IncidentToTime
• OffenseCode
• Offense Description
• CaseStatus
• CaseDisposition
• LocationCode
• PatrolZone
• Location
• Landmark
• LocationStreetNumber
• LocationDirectional
• LocationStreet
• LocationLatitude
• LocationLongitude
• CreatedSource
2.1 Overview
A total of 11578 crimes were recorded at Georgia Tech and in nearby regions.
By looking at the distribution over time of day, day, and month, we observed
that crimes occurred most often at around 1 am, and the frequency gradually
dropped until 6 am, when crime was least likely to occur. The crime
count fluctuated smoothly between 400 and 700 crimes per hour from 8 am to
11 pm. April and September, which fall within the Spring and Fall semesters,
had the highest numbers of crimes during the year. Offense codes 2700, 3657,
and 2751 topped the list of crimes. Approximately three-fourths of cases
were closed or cleared, and even the remaining cases were mostly inactive.
From the analysis of the four patrol zones defined by the Georgia Tech police,
Zone 2 was found to be the most dangerous: the number of crimes there
was almost twice that of any other zone. Detailed crime type analysis
is presented later.
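The hour-of-day distribution described above can be reproduced with a short tally. `crimes_by_hour` is a hypothetical helper and assumes incident times arrive as 'HH:MM' strings (the real IncidentFromTime format may differ):

```python
from collections import Counter

def crimes_by_hour(incident_times):
    """Histogram of incidents by hour of day, assuming each time is an
    'HH:MM' string (an assumption about the IncidentFromTime field)."""
    return Counter(int(t.split(":")[0]) for t in incident_times)

times = ["01:15", "01:40", "06:05", "13:30"]   # hypothetical values
counts = crimes_by_hour(times)
```

Plotting such a histogram over all 11578 records is what reveals the 1 am peak and the 6 am trough.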
2.2 Urban Police Patrol Model
Initially, the police department measured patrol efficiency using patrol
time alone, and even that calculation contained many errors. Patrol time was
computed as the difference between officers' total work time and the time
they spent on other duties, such as answering radio calls, directing
traffic, or taking meal breaks. This calculation gave inaccurate efficiency values
since many police officers patrolled during the periods when crimes
were most likely to happen, leaving too few officers
on patrol at other times. The goal was to allocate the right number of
patrol officers across all periods.
To fix these problems, in the 1960s, Dr. Richard Larson developed a systematic
approach to studying police patrol efficiency. He cooperated with the NYC
police department to develop the first version of the Urban Police Patrol
Model. He used the same 18 variables we listed above and came up with the
most "accurate" model. The key question of Dr. Larson's model was as follows:
given the pattern of crimes and a limited amount of preventive patrol, how
should the effort be allocated along the streets to achieve the highest
efficiency? We were influenced by his study and decided to adopt his model to
analyze the Atlanta crime data and optimize the patrol resource allocation.
Figure 1: Patrol Time = Total Time − Time for other duties
2.3 Further Questions
Dr. Larson's model gave a rough approximation of the behavior of police
preventive patrol. Qualitatively, Koopman's method suggested that the
patrol effort should grow as the logarithm of the crime density increased,
and further advised that areas with a low likelihood of crime should not be
patrolled at all. Refinement of this model was required before it could be
implemented by the police, and a few more questions were asked by Dr.
Larson:
1. To what extent is an optimal patrol coverage function realizable?
2. How closely does a unit have to approach the optimal coverage in order
to achieve a satisfactory result? Or, equivalently, what is the sensitivity
of the solution near the optimum?
3. To what extent is the crime distribution modified by patrol strategies?
4. How should each crime type be evaluated to reflect its relative
seriousness?
5. What is the conditional probability that a crime will be detected given
its pattern?
We believe that these questions contain valuable insights and will continue
our research to answer questions in these areas.
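The Koopman-style allocation mentioned above, where effort grows with the logarithm of crime density and low-density areas get no patrol, can be sketched as a water-filling optimization. This is an illustrative reconstruction under stated assumptions, not the report's or Larson's actual code; `koopman_allocation` and the sample densities are hypothetical:

```python
import math

def koopman_allocation(densities, total_effort, iters=60):
    """Water-filling sketch of Koopman-style search allocation: maximize
    sum(p * (1 - exp(-x))) subject to sum(x) = total_effort. The optimum
    has the form x_i = max(0, log(p_i / lam)), so effort grows with the
    log of crime density and low-density areas receive no patrol."""
    lo, hi = 1e-12, max(densities)
    effort = []
    for _ in range(iters):          # bisect on the multiplier lam
        lam = (lo + hi) / 2
        effort = [max(0.0, math.log(p / lam)) for p in densities]
        if sum(effort) > total_effort:
            lo = lam                # too much total effort: raise the bar
        else:
            hi = lam
    return effort

# Hypothetical relative crime densities for four areas
effort = koopman_allocation([8, 4, 2, 1], total_effort=math.log(64))
```

With this budget the multiplier converges to 1, so the three densest areas get effort log(8), log(4), and log(2), while the lowest-density area is not patrolled at all, matching the qualitative advice above.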
3 Georgia Tech Crime data
One interesting phenomenon was that while the crime rate of Atlanta kept
decreasing at a rate of 5.324%, the crime rate of Georgia Tech fluctuated
over the years. As the graphs suggest, the crime rate was higher in 2011
compared to 2010, then decreased in 2012 and reached its peak in
2013. We were unsure whether the crime rates actually changed, or whether a
change in the benchmark for identifying crimes at Georgia Tech was the
reason: for example, some crimes in 2010 were categorized differently in
later years.
Figure 2: Georgia Tech crime fluctuates (number of crimes per year, 2010-2014)
Figure 3: Atlanta crime decreases (number of crimes per year, 2010-2014)
We also observed the crime patterns using time series. In order to find the
relationship between Atlanta crimes and Georgia Tech crimes, we compared
the annual data and realized that the crime count patterns did not have a
recognizable similarity. In fact, there were some notable differences in
their patterns.
Figure 4: Monthly number of crimes for Georgia Tech and Atlanta
While Georgia Tech had fewer crimes in the summer, Atlanta
had even more. It was clear that Georgia Tech's summer crime rate was
low because most students left the campus for vacation. However, we did
not have an easy explanation for why the Atlanta data showed an increase in
summer. Even when we averaged the rates across years and plotted the graphs
for GT and Atlanta together, we could see that Georgia Tech was more
dangerous during the semesters, while the opposite held for Atlanta. It was
also interesting that the crime rate at Georgia Tech generally decreased
between late August and December. We tried to come up with a few
reasons for this phenomenon. First, most freshmen arrived at the end of
August every year; they lacked a sense of safety and were thus
much more vulnerable to crime. Second, September was the pledge month
for fraternities and sororities. Students were asked to do risky things and
were at risk of being targeted, especially when they were drunk or walked
outside late at night.
Figure 5: Average number of crimes for Georgia Tech and Atlanta
3.1 Geographical Relationship
We analyzed crimes geographically using offense codes and patrol
zones. This was easily done by making a pivot table and examining the
results. It showed the overall trend of the data and gave insights into
which other techniques to apply to achieve even better results.
Our objective was to prove or disprove that there exists a clear relationship
between the GTPD patrol zones (Zones 1-4) and the offense codes used by
the NCIC. Furthermore, if this process proved to be effective, we could
apply the same procedure to analyze the Atlanta crime data.
We used all of the GTPD data from 2010 to 2014. To exclude unnecessary
information, we took account of only two variables: Patrol Zone and Offense
Code. For every crime, both its offense code and its location were given, so
we had enough data for analysis. We programmed Excel to give the output
in the following way: Z1 = [22 : 24, 23 : 325, 29 : 84, . . . ]
• The first two numbers represent the first two digits of the offense code
• The number after the colon is the count of such incidents
• For example, there were 24 crimes coded "22"
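The per-zone tally in that output format can be reproduced with a short script. `zone_code_counts` and the sample records are hypothetical (the actual workflow used Excel):

```python
from collections import defaultdict

def zone_code_counts(incidents):
    """Tally incidents per patrol zone by the first two digits of the
    NCIC offense code, mirroring the Z1 = [22 : 24, ...] output format.
    `incidents` is a list of (patrol_zone, offense_code) pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for zone, code in incidents:
        counts[zone][str(code)[:2]] += 1
    return {zone: dict(c) for zone, c in counts.items()}

# Hypothetical sample records (zone, offense code)
sample = [("Z1", "2202"), ("Z1", "2204"), ("Z1", "2301"), ("Z2", "2299")]
tallies = zone_code_counts(sample)
# tallies == {"Z1": {"22": 2, "23": 1}, "Z2": {"22": 1}}
```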
We played around with the NCIC code list before we proceeded with the
test.
• There were many different types of offense on the offense code list, but
we could categorize them nicely based on their first two numbers
• We excluded some offenses from the data because they were student
conduct violations, public order crimes, juvenile cases, invalid entries, or
trivial to the overall data
Using our manipulated data set, we generated a pivot table.
Figure 6: The pivot table (Location versus type of crimes)
Based on the table, we found that Zone 2 had the largest number
of crimes. In particular, Zone 2 had the most assaults, burglaries,
property damage incidents, and stolen vehicles compared to the other zones.
In conclusion, our approach could have worked better with more data. We will
apply this method to the Atlanta data later, since we believe there will be
enough data. However, we concluded that we could not infer more information
about the relationship between location and type of crime at Georgia Tech.
Crime Type       Most frequent (# of crimes)   2nd most frequent (# of crimes)
Assault          Zone 2 (94)                   Zone 3 (24)
Burglary         Zone 2 (79)                   Zone 4 (32)
Damage Property  Zone 2 (171)                  Zone 1 (84)
Stolen Vehicle   Zone 2 (43)                   Zone 1 (27)
Based on this approach, we could conclude that Zone 2 was the most dangerous
zone. It was difficult to find a relationship between types of crimes and
patrol zones because Zone 2 had so many more crimes than the other zones;
there was not enough information about crimes in the other zones. There are
several explanations for this lack of data. First, the Georgia Tech
campus was considered safe and did not have many crimes to record. Second,
many of the recorded crimes were minor, and after we filtered them out,
we were left with only a little data. Last, there were not enough variables
to take into account; there could have been more significant factors
contributing to the result.
3.2 Questions
We came up with some questions that needed to be answered in
order to continue our research. We will list them here:
1. How are the 4 zones divided? Can we have a detailed description
of where each zone is?
2. There are 4 zones within Georgia Tech, and there are 2 more zones: off
campus and SAV. What does SAV mean?
3. Many incidents counted as "minor" crimes. Are they really
insignificant enough to be excluded from our research, or should we
give them more attention?
4 Atlanta Crime data
4.1 Time series and Seasonality
A time study of the criminal data was helpful in revealing crime patterns
over time. With the 2011-2014 crime data, we grouped the entries by date
(occur_date) and crime type (UC2 Literal), returned the count of each
crime type on every reported date, and performed a time series analysis.
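The grouping step can be sketched with the standard library (the rows below are made-up examples in the shape of the occur_date and UC2 Literal columns):

```python
from collections import Counter

# Hypothetical rows mimicking the occur_date / UC2 Literal columns.
rows = [
    ("2013-09-02", "LARCENY"),
    ("2013-09-02", "LARCENY"),
    ("2013-09-02", "AGG_ASSULT"),
    ("2013-09-03", "LARCENY"),
]

counts = Counter(rows)  # one count per (date, crime type) pair
print(counts[("2013-09-02", "LARCENY")])  # 2
```

Each (date, crime type) pair maps to its daily count, which is exactly the series the time series analysis consumes.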
In the time series plot of total crimes per day, we could observe a rough
seasonal pattern. We were unsure whether this seasonal pattern appeared
in all crime types or only in a few that dominated the total crime
rate. To figure this out, we decomposed the data into different crime types
and performed a time series analysis on each. Two crime types
returned notably interesting patterns: aggravated assault (AGG_ASSULT) and
larceny (LARCENY). Therefore, we decided to investigate these types of
crimes further. Below are their time series plots.
Moreover, using the additive single exponential smoothing method, we were
able to smooth the data and produce cleaner diagrams:
The smoothed plots showed the trend of the crime data. The
frequencies of both aggravated assault and larceny tended to peak around
September, and they slowly dropped to a minimum in March.
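Single exponential smoothing of this kind can be sketched in a few lines (alpha here is a hypothetical smoothing constant; the report does not state the value actually used):

```python
def exp_smooth(series, alpha=0.3):
    """Single exponential smoothing: s_t = alpha*x_t + (1 - alpha)*s_{t-1}."""
    smoothed = [series[0]]          # initialize with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

print(exp_smooth([10, 20, 30], 0.5))  # [10, 15.0, 22.5]
```

Each smoothed value is a weighted average of the new observation and the previous smoothed value, which is what flattens the day-to-day noise in the crime counts.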
Using the Holt-Winters' method, we were able to apply weights to the data
points, and we came up with an applicable model for the current data.
The diagram below shows our result of applying the Holt-Winters' method
to the larceny data. Red points represented the smoothed data points of our
model.
With the smoothed model, we were able to make predictions about future
data points. We applied this method to the larceny data and predicted
100 more data points with a 95% prediction interval. The residual plots
are shown below.
In the following diagram, blue points represented the actual data; red
points showed the smoothed data points, with lower weight on older data
and higher weight on more recent data; green points gave the prediction for
the next 100 data points; and purple points were the upper and lower bounds
of the 95% prediction interval around the green points. The residual
analysis plot of this method was as follows. The p-value of the
Anderson-Darling test was 0.009; strictly, a value this far below the usual
0.05 threshold suggests the residuals deviated somewhat from the normality
assumption. The residuals-versus-fits plot showed that the residuals were
randomly distributed, which supported our identical variance assumption.
The Holt-Winters' method had a mean absolute percentage error (MAPE) of
14.4879, a mean absolute deviation (MAD) of 6.1235, and a mean squared
deviation (MSD) of 60.0624. These errors were lower than those of the
single exponential method, which indicated that the Holt-Winters' method
was the more appropriate choice in this time study.
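The three accuracy measures quoted above can be computed as follows (these are the standard Minitab-style definitions; the actual and fitted values in the usage line are illustrative, not the report's data):

```python
def accuracy_measures(actual, fitted):
    """MAPE (in percent), MAD, and MSD of one-step fitted values."""
    n = len(actual)
    mape = 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, fitted))
    mad = sum(abs(a - f) for a, f in zip(actual, fitted)) / n
    msd = sum((a - f) ** 2 for a, f in zip(actual, fitted)) / n
    return mape, mad, msd

mape, mad, msd = accuracy_measures([10, 20], [8, 25])  # ~ (22.5, 3.5, 14.5)
```

Lower values of all three measures indicate a better fit, which is the basis for preferring Holt-Winters' over single exponential smoothing here.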
4.2 Hot spots
We began our data analysis by checking whether there were areas of
concentrated crime in Atlanta. In order to locate these areas, called "hot
spots," we used four basic statistical tests: mean center, standard deviation,
standard deviation ellipses, and the test for clustering. The mean center
gave us the mean longitude and latitude of crimes, the standard deviation
showed how deviated the crimes were with respect to the mean center, and
the standard deviation ellipses visually showed which crimes were one stan-
dard deviation away from the mean center. Most importantly, the test for
clustering gave information on the closeness of crime locations.
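The first two of these tests are straightforward to sketch (planar approximation over longitude/latitude coordinates, matching the simplification acknowledged later in this section):

```python
import math

def mean_center(points):
    """Mean longitude/latitude of the crime locations."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Root-mean-square distance of the points from the mean center."""
    cx, cy = mean_center(points)
    return math.sqrt(
        sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points)
    )
```

The standard distance plays the role of the "standard deviation" test: it summarizes how spread out the crimes are around the mean center, and an ellipse scaled by it gives the standard deviation ellipse.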
The mean center we found was near the Fulton County Juvenile Court.
We figured that the mean center by itself did not give much information about
hot spots: it was not necessarily true that crimes near the mean center
occurred with a high probability. However, it was useful as a point of
comparison, since we could check where other crimes occurred relative to it.
The results from the standard deviation and the standard deviation ellipses
were also vague. The standard deviation ellipses did not map the concentrated
areas of crime: some areas of an ellipse had frequent crimes, while
other areas within the same ellipse did not have many. On the other
hand, the values obtained from the test for clustering were relative, and thus
comparable. Therefore, we concluded that the test for clustering gave
the most accurate representation of hot spots among the four tests.
To test for clustering, we used the nearest neighbor index method. Simply
put, we generated random crime spots in Atlanta and compared how close the
actual crime spots were to one another against how close the random spots
were. The ratio of the mean nearest-neighbor distance among the observed data
to that among the random data is called the Nearest Neighbor Index (NNI).
The smaller the NNI, the more clustered the data: an NNI below 1 indicates
clustering, and we considered the data clearly clustered when the NNI was
close to 0.5 or below.
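A minimal sketch of the NNI computation, assuming planar distances and a rectangular bounding box for the random points (both simplifications are discussed later in this section):

```python
import math
import random

def mean_nn_distance(pts):
    """Average distance from each point to its nearest neighbor."""
    return sum(
        min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
        for i, p in enumerate(pts)
    ) / len(pts)

def nearest_neighbor_index(observed, bbox, rng=random):
    """NNI = mean NN distance of the observed points divided by that of
    uniformly random points in the same bounding box. Values well below 1
    suggest clustering; distances are planar (no Haversine correction)."""
    (xmin, ymin), (xmax, ymax) = bbox
    rand_pts = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
                for _ in observed]
    return mean_nn_distance(observed) / mean_nn_distance(rand_pts)

# Toy usage: a tight cluster inside a large box yields an NNI far below 1.
rng = random.Random(1)
cluster = [(0.0, 0.0), (0.01, 0.0), (0.0, 0.01), (0.01, 0.01), (0.005, 0.005)]
nni = nearest_neighbor_index(cluster, ((0.0, 0.0), (100.0, 100.0)), rng)
print(nni < 1.0)  # True
```

Averaging this ratio over several random draws, as the report does, reduces the variance introduced by the random reference points.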
The NNI for all crimes in Atlanta was 0.543, which showed a definite spatial
clustering of crimes. Then we found the NNI for each type of crime. To
minimize error, we calculated each NNI several times and took the average.
Table 1 shows the NNI for each type of crime. Since every NNI was less than
1, all crime types were clustered to some degree. Note that robbery was the
most clustered and murder the least. Except for murder and rape, all other
crimes' NNIs were below 0.5, which implied that the hot spots were worth
investigating. One reason robbery and theft had the lowest NNIs was their
relatively frequent occurrence: the data showed that these types of crimes
appeared more often than the others, so it was natural that there were hot
spots where victims were more vulnerable to robbery and theft. On the
contrary, since rape and murder took place less frequently than other crimes,
it was not surprising to observe more scattered data.
Type of crime NNI
Total 0.543
Assault 0.416
Burglary 0.448
Murder/Homicide 0.823
Rape 0.694
Robbery 0.258
Theft 0.371
Vehicle 0.414
Table 1: NNI for different types of crimes
Some notable hot spot regions for all types of crimes included the
areas along 10th Street NW and along Peachtree Street SW. Although not
many crimes occurred inside schools, many crimes were reported near
colleges, including Georgia Tech, Georgia State University, Clark Atlanta
University, and Spelman College. Since we were specifically interested in the
relationship between robbery and auto theft, we compared the hot spots of
auto theft to the hot spots of robbery and observed some overlaps. We have
yet to conduct a statistical test on the correlation between the two crime
types, but this seemed like a notable topic to study, and we decided to do
more research to figure out whether stolen cars were used to commit other
crimes.
Some errors in the analysis came from the crime types not having the same
amounts of data: the crime type with the most data will most likely produce
an accurate NNI, while the type with the least data will not. Another error
appeared when generating random crime spots on the map. We had difficulty
setting an exact boundary and instead generated random points inside a
rectangle that approximately resembled the border of Atlanta. Furthermore,
we assumed that the Earth was a two-dimensional plane and used the planar
distance formula, which was inappropriate. Our results would have been
improved with the help of the Haversine formula:
d = 2r · arcsin( sqrt( sin²((φ₂ − φ₁)/2) + cos(φ₁) cos(φ₂) sin²((λ₂ − λ₁)/2) ) )
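A direct implementation of the Haversine distance (inputs in degrees; r is the Earth's radius in kilometers):

```python
import math

def haversine(lat1, lon1, lat2, lon2, r=6371.0):
    """Great-circle distance between two (lat, lon) points given in degrees;
    r is the Earth's radius in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))
```

Substituting this for the planar distance in the nearest-neighbor computation would remove the flat-Earth error, at the cost of a few extra trigonometric calls per pair.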
Even though there were many ways to compute a more accurate NNI, the
currently calculated NNI is sufficient for comparing the clustering of one
crime type against another. However, we could develop a better algorithm to
compute the NNI, as the current algorithm performs many unnecessary
computations. For instance, it calculated the distances between all pairs of
points and compared all the values, when we could have selected only a few
candidate points to compare. We would improve our algorithm by incorporating
the Voronoi diagram and Fortune's algorithm to reduce the computation time.
This would allow us to analyze more data in less time, and we would also be
able to process higher-dimensional data more efficiently.
As shown in Figure 9, the Voronoi diagram is a plot with points divided
up by half-planes. The subspaces are divided such that each subspace contains
exactly one point, and the imaginary line segment connecting two neighboring
points is perpendicular to their shared border. Since the points are now
spatially sorted, this diagram can answer a nearest-neighbor query
intelligently in O(log n) time. The problem is that naively generating the
half-planes takes a long time, O(n² log n), which slows down the process.
Luckily, Fortune's algorithm can construct the diagram faster, in O(n log n).
Therefore, combining the two, we end up with a total computation time of
O(n log n).
(a) Voronoi diagram step 1 (b) Voronoi diagram step 2 (c) Fortune's algorithm
Figure 9: Construction of a Voronoi diagram and Fortune's algorithm
In addition to those improvements, we can also filter out avoidable
calculations by identifying unstable queries. An unstable query arbitrarily
sets a border around each point so that the algorithm can determine which
points to include in its computation. Along with the integration of the
algorithms stated above, this improvement will further reduce the computation
time. Additionally, the algorithm can be used to find which points are
located near a given point.
Finally, we will perform more statistical tests on the data set. Our focus
will be to reduce errors and computation time, as well as to locate zones
that need more attention from officers. Once we have the algorithm, we will
be able to suggest new patrol routes that minimize the arrival time at a
crime site, or the optimal number of officers in each patrol zone. Then, by
comparing with the optimized solution, we can check how efficient the current
resource allocation is.
4.3 Auto Theft
When we checked the hot spots, we noticed that the hot spots for auto
theft and for robbery overlapped considerably. We were interested in this
observation and decided to test the relationship between auto theft and
robbery. We then realized that the criminals' primary goal in auto theft was
not to commit robberies, but rather to sell the cars. If they did not sell a
car right away, however, they used it to commit other crimes, including
joyriding (driving around freely), drug dealing, or robbing.
One way we tried to find the correlation between auto thefts and other
crimes was by tracking a stolen car and checking whether it was recorded
again as a suspect's car. The most obvious way to do so was by comparing
license plates. However, there was not enough information; many times,
witnesses could not remember the license plate numbers. Instead, we compared
other attributes of the stolen vehicles and suspect vehicles. There was too
much information, so we filtered out the less important fields and ended up
with 60 variables. We reconstructed two data sets using them and started our
research.
One file contained all the necessary information about auto thefts, such as
the offense code and the date of the crime. Unfortunately, one fourth of the
crime records did not contain any information about the stolen car; more
consistent and complete documentation would have been helpful. The other file
included the information about suspect vehicles: here, we listed any vehicle
that was used to commit any type of crime. This data set also had an
insufficient amount of data, but we wrote code to make the best use of the
two files.
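The matching logic between the two files can be sketched as follows (the records and field names are illustrative stand-ins, not the actual 60 variables):

```python
# Hypothetical records: the real matching used up to 60 attributes; here we
# match on just year, maker, and color, and require the suspect-vehicle
# sighting to come after the theft.
stolen = [
    {"year": 1999, "maker": "Ford", "color": "White", "stolen_hour": 100},
    {"year": 2004, "maker": "Dodge", "color": "Black", "stolen_hour": 300},
]
suspect = [
    {"year": 1999, "maker": "Ford", "color": "White", "seen_hour": 104},
    {"year": 2011, "maker": "Toyota", "color": "Silver", "seen_hour": 50},
]

matches = [
    (s, v)
    for s in stolen
    for v in suspect
    if (s["year"], s["maker"], s["color"]) == (v["year"], v["maker"], v["color"])
    and v["seen_hour"] >= s["stolen_hour"]
]
print(len(matches))  # 1
```

Matching on attribute tuples rather than license plates is weaker evidence of identity, which is why the report treats these as candidate matches rather than confirmed ones.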
While examining the criminals' habits, we came up with more questions, such
as the time delay between when a car was stolen and when it was used in a
robbery, which car types were most vulnerable to theft, and how the stolen
cars were used. In the following paragraphs, we provide an analysis of the
police crime data along with these derived questions.
The easiest way to categorize cars was by their colors and makers, so we
made a color-versus-maker pivot table. The most noticeable information we
got from the pivot table was that Dodge, Chevrolet, and Ford were the most
popular targets, and white, black, and silver were the most vulnerable
colors. In particular, older models from the 1990s were targeted frequently.
These results were fairly intuitive, as those cars had weaker security
systems and criminals did not want to get noticed by robbing fancy cars.
However, to our surprise, the thieves showed more interest in luxury cars
than we had expected. We found that the main reason for stealing those cars,
despite the difficulty of doing so, was that they could be sold for high
prices.
Figure 10: The popularities of car types and their years (bar chart of the
number of stolen cars for Dodge, Chevrolet, Ford, and Honda, broken down by
color: white, black, and silver)
It was obvious that criminals targeted old, common cars for easy theft
and expensive cars for a high return. What was interesting, however, was that
criminals tended to take cars that were less valuable than the cars they used
to commit the robbery; in other words, they used newer cars to steal older
cars. This could be interpreted in two ways: they wanted small, easy money,
or they needed a fresh car to commit a new crime. We needed to know what
they did with the stolen cars. To do so, we found out how much time criminals
spent before committing a crime with their stolen cars. Out of 5270 auto
theft offenses and 4237 suspect vehicle cases, we found 48 exact matches. In
these 48 cases, the average time before a stolen car was spotted at another
crime scene was about 4 hours, excluding a few cars that reappeared several
days later. In particular, among the 48 cases, two cars were used to commit
multiple crimes within a short time period. Since the cars were used in
crimes only a few hours after they were stolen and were then sold, we could
infer that criminals stole cars to make their crimes less traceable and to
earn some quick cash.
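The delay statistic can be sketched like this (the delay values are hypothetical; the report's figure of about 4 hours came from the 48 matched cases, after excluding multi-day reappearances):

```python
# Hypothetical delays (hours) between theft and reappearance at a crime scene.
delays_hours = [3, 5, 2, 6, 120]   # 120 h: a car that reappeared days later

same_day = [d for d in delays_hours if d < 24]   # drop multi-day reappearances
average_delay = sum(same_day) / len(same_day)
print(average_delay)  # 4.0
```

The 24-hour cutoff is an assumed threshold for "reappeared several days later"; the report does not state the exact rule it used.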
Suspect vehicle year   Suspect vehicle maker   Stolen vehicle year   Stolen vehicle maker
2002                   Chevrolet               1999                  Ford
2010                   Dodge                   2004                  Dodge
2011                   Toyota                  1996                  Honda
2001                   Ford                    1996                  Honda
2008                   Nissan                  1984                  Oldsmobile

Table 2: Examples of suspects stealing less valuable cars
4.4 Questions and Goals
Our goals for next semester will be as follows.
1. Upgrade the model used for Georgia Tech crimes so that it can be applied
to the Atlanta crimes.
2. Develop a better algorithm for locating the hot spots.
3. Find geographic matches and correlations among crimes.
4. Suggest an optimized way of allocating resources.
5. Recognize crime patterns.
To continue with our research, we needed more information about the crimes.
We will list some questions that are preventing us from advancing.
1. Atlanta is divided into 5 zones, but the Excel data gives the place
of each crime by latitude and longitude, not by zone. Given the coordinates
of a place, is there a way of telling which zone that place is in?
2. Have the crime criteria changed over the past few years? In other words,
is there a crime that used to be considered a "type A" crime but is now
"type B"?
Now that we are familiar with the data and have gained insights into crime
in Atlanta, we are certain that our progress will speed up. There was a
limitation on the amount of data; however, we learned to make use of small
data sets to come up with noteworthy conclusions. We hope to establish a
generalized algorithm that can be used in many cities.
4.5 Reference
Pictures of Voronoi Diagrams: https://www.youtube.com/watch?v=7eCrHAv6sYY