The document summarizes Margaret Furr's research analyzing crime incident data from 2012-2014 in Washington D.C. in relation to university campus locations. Furr created space-time cubes and conducted emerging hot spot analysis to identify crime patterns near campuses over time. The analysis found increasing trends in theft/other and theft/auto crimes near Howard University, as well as sporadic sexual assault incidents. Theft/other crimes were also increasingly clustered near other D.C. university campuses. Future work could include analyzing additional years of crime data and optimizing space-time cube parameters.
Crime Risk Forecasting and Predictive Analytics - Esri UC (Azavea)
Presentation at the 2011 Esri User Conference that included an overview of HunchLab features related to forecasting, specifically near repeat forecasts and load forecasts.
As we develop our crime analysis software, HunchLab, we are always on the lookout for ways of examining and improving data quality, as well as new academic research that shows promise to enhance crime analysis.
In this one-hour webinar, we first explain some of the ways we examine data quality when we utilize historic incident datasets for research and analysis and how you can use these techniques in your department. Then, we walk through a series of analytic techniques and practices that can help your department improve your crime analysis processes.
Deep Learning for Public Safety in Chicago and San Francisco (Sri Ambati)
Presentation on Deep Learning for Public Safety using open data sets from the cities of San Francisco and Chicago.
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Forecasting Space-Time Events - Strata + Hadoop World 2015 San Jose (Azavea)
This presentation uses the speaker’s experience in building a crime forecasting package to outline some tools and techniques useful in modeling space-time event data. While the case study focuses on modeling crime, the techniques and tools presented are applicable to a broad selection of domains.
This presentation was given at Strata + Hadoop World 2015 in San Jose by Jeremy Heffner.
Hello Criminals! Meet Big Data: Preventing Crime in San Francisco by Predicti... (Tarun Amarnath)
Throughout the world, people look to San Francisco as a hub for technology; however, this reputation hides an undercurrent of crime in the City by the Bay. My experiment uses Azure ML and Python to predict, without bias, the category of crime likeliest to occur at a certain time and location in San Francisco.
This presentation covers the requirements to get started with HunchLab 2.0's predictive policing system. It starts discussing technical requirements (security, authentication) and then proceeds to discuss guidelines for configuring meaningful predictive models of crime. The presentation concludes with information about related geographic and temporal datasets that are useful in forecasting crime with recommendations on how to prioritize data sets to use in HunchLab.
A Two-Fold Quality Assurance Approach for Dynamic Knowledge Bases : The 3cixt... (Nandana Mihindukulasooriya)
The presentation for the paper "A Two-Fold Quality Assurance Approach for Dynamic Knowledge Bases : The 3cixty Use Case" presented at the 1st International Workshop on Completing and Debugging the Semantic Web at the 13th Extended Semantic Web Conference.
DSD-INT 2019 How machine learning will change flood risk and impact assessmen... (Deltares)
Presentation by Dennis Wagenaar, Deltares, at the Data Science Symposium, during Delft Software Days - Edition 2019. Thursday, 14 November 2019, Delft.
Big data refers to data sets that are so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, and information privacy.
Hadoop Training in Chennai from BigDataTraining.IN, a leading global talent development corporation building a skilled manpower pool for global industry requirements. BigDataTraining.IN has grown to be among the world's leading talent development companies, offering learning solutions to individuals, institutions, and corporate clients.
This presentation gives an overview of geodemographic classifications and why there is a need to use open tools and methods for creating them. The presentation also describes the challenges involved in creating real-time geodemographic classifications and the use of social media data for geodemographic applications.
Revealing spatial and temporal patterns from Flickr photography: a case study... (Sander van der Drift)
An exploratory visual analytics approach was used to identify temporal distributions, spatial clusters and popular routes of tourists in Amsterdam by making use of geotagged photos from the social media platform Flickr. The presented methods combine the analytical strength of humans with the data processing power of computers, using geovisualisations and charts to explore data, find patterns, and draw conclusions from the outcomes. For this research, the metadata of 2,849,261 geotagged photos was harvested from Flickr and stored in a spatial database. From this dataset, 393,828 photos were located in the municipality of Amsterdam. A semi-automatic classification method classified 39.1% of the users as tourists with very high precision and recall. The temporal distribution of tourists and locals is compared for different temporal granularities. A method is presented to assess photo timestamps by making use of photos that contain a real clock. An existing grid-based clustering method was implemented and improved to explore Amsterdam's spatial distribution of tourists in Google Earth. The major tourist hotspots are detected using the density-based clustering algorithm DBSCAN. Finally, the most probable routes of tourists between subsequent photo locations were estimated and aggregated into a route density map. A qualitative approach was used to validate the study outcomes by interviewing eight tourism experts of the municipality of Amsterdam. Their knowledge about the city bears a good resemblance to the detected spatial clusters and route density map of tourists. Despite several imperfections of geosocial data, we conclude that the methods provide meaningful insight into the spatial and temporal patterns of tourists in urban spaces and are a valuable addition to traditional tourism surveys.
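The hotspot-detection step above relies on DBSCAN. A minimal, self-contained sketch of the algorithm on toy 2-D points (the brute-force neighbor search and the example coordinates are illustrative, not the study's actual implementation):

```python
from math import hypot

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN over 2-D points; returns one label per point, -1 = noise."""
    n = len(points)

    def neighbors(i):
        # Brute-force epsilon-neighborhood (includes the point itself)
        return [j for j in range(n)
                if hypot(points[i][0] - points[j][0],
                         points[i][1] - points[j][1]) <= eps]

    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = -1                 # provisionally noise
            continue
        cluster += 1                       # i is a core point: start a cluster
        labels[i] = cluster
        seeds = list(nb)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster        # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = neighbors(j)
            if len(nj) >= min_pts:         # j is also core: keep expanding
                seeds.extend(nj)
    return labels

# Two tight groups of points plus one isolated outlier
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(pts, eps=2.0, min_pts=3)   # -> [0, 0, 0, 1, 1, 1, -1]
```

Density-based clustering suits hotspot detection because it needs no preset cluster count and marks sparse points as noise rather than forcing them into a cluster.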
What does it take to create a web of government Linked Data? The UK government is finding out. Our story is one of pioneers. You will hear how we are moving out of existing settlements to the wide plains of government data. How we are starting to build the first railroads across this vast territory to open a new lands of opportunity. All the time, of course, having to avoid both outlaws and the Civil War back east.
From Virtual Museums to Peacebuilding: Creating and Using Linked Knowledge (Craig Knoblock)
Companies, such as Google and Microsoft, are building web-scale linked knowledge bases for the purpose of indexing and searching the Web, but these efforts do not address the problem of building accurate, fine-grained, deep knowledge bases for specific application domains. We are developing an integration framework, called Karma, which supports the rapid, end-to-end construction of such linked knowledge bases. In this talk I will describe machine-learning techniques for mapping new data sources to a domain model and linking the data across sources. I will also present several applications of this technology, including building virtual museums and integrating data sources for peacebuilding.
Safety is a major concern everywhere, and many crimes happen every day. Analyzing crime data can identify the frequency of crimes, the types of crimes, areas with a higher number of crimes, and more. These insights then have the potential to aid proactive preventive measures by police, increasing the level of safety in certain areas. To add a different dimension to the analysis, we considered California State University Los Angeles as our focal point and projected the data based on different parameters such as time and distance, extracting key findings about crimes occurring around California State University Los Angeles and in Los Angeles.
Similar to ArcGIS Space-Time Mining of Crime Data
The Building Blocks of QuestDB, a Time Series Database (javier ramirez)
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps as just another data type. However, when performing real-time analytics, timestamps should be first-class citizens, and we need rich time semantics to get the most out of our data. We also need to deal with ever-growing datasets while staying performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review some of the changes we have gone through over the past two years to deal with late and unordered data, non-blocking writes, read replicas, and faster batch ingestion.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will present on related topics such as vector databases, LLMs, and managing data at scale. The intended audience includes machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup was formerly Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.
Techniques to optimize the PageRank algorithm usually fall into two categories. One tries to reduce the work per iteration; the other tries to reduce the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before PageRank computation to improve performance; final ranks of chain nodes can be easily calculated. This could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This could help reduce the iteration time and the number of iterations, and also enable multi-iteration concurrency in PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Learn SQL from basic queries to advanced queries (manishkhaire30)
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
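The progression the outline describes, from simple retrieval and filtering through aggregation to more advanced queries, can be tried directly with Python's built-in sqlite3 module. The table and values below are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (region TEXT, amount REAL)")
cur.executemany("INSERT INTO sales VALUES (?, ?)",
                [("east", 100), ("east", 250), ("west", 80), ("west", 300)])

# Foundations: retrieval with filtering
cur.execute("SELECT region, amount FROM sales WHERE amount > 90")
rows = cur.fetchall()                       # three rows pass the filter

# Aggregation: totals per region
cur.execute("SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region")
totals = cur.fetchall()                     # [('east', 350.0), ('west', 380.0)]

# Advanced: filter on an aggregate with HAVING and a subquery
cur.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    HAVING total > 360
""")
big = cur.fetchall()                        # [('west', 380.0)]
```

An in-memory SQLite database is a convenient sandbox for practicing each layer of SQL before pointing the same queries at a production engine.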
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
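The decomposition the abstract describes can be sketched compactly: find strongly connected components, then solve each component's ranks in topological order, so earlier components' ranks are final inputs to later ones. This is an illustrative single-threaded toy (recursive Tarjan, Gauss-Seidel-style updates), not the report's benchmarked code, and it assumes no dead ends:

```python
def tarjan_sccs(adj):
    """Tarjan's algorithm; emits SCCs in reverse topological order."""
    n = len(adj)
    index, low, onstack = [None] * n, [0] * n, [False] * n
    stack, comps, counter = [], [], [0]

    def strong(v):
        index[v] = low[v] = counter[0]; counter[0] += 1
        stack.append(v); onstack[v] = True
        for w in adj[v]:
            if index[w] is None:
                strong(w)
                low[v] = min(low[v], low[w])
            elif onstack[w]:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:              # v is the root of an SCC
            comp = []
            while True:
                w = stack.pop(); onstack[w] = False; comp.append(w)
                if w == v:
                    break
            comps.append(comp)

    for v in range(n):
        if index[v] is None:
            strong(v)
    return comps

def levelwise_pagerank(adj, d=0.85, tol=1e-12, iters=200):
    """Rank one SCC at a time in topological order (graph must have no dead ends)."""
    n = len(adj)
    into = [[] for _ in range(n)]
    for u, outs in enumerate(adj):
        for v in outs:
            into[v].append(u)
    rank = [1.0 / n] * n
    for comp in reversed(tarjan_sccs(adj)):     # topological order
        for _ in range(iters):
            delta = 0.0
            for v in comp:
                nv = (1 - d) / n + d * sum(rank[u] / len(adj[u]) for u in into[v])
                delta = max(delta, abs(nv - rank[v]))
                rank[v] = nv
            if delta < tol:                     # this component has converged
                break
    return rank

# Toy graph: SCC {0,1,2} fed by the single-vertex SCC {3}
r = levelwise_pagerank([[1], [2], [0], [0]])
```

Because vertex 3 has no in-links, its rank is exactly (1-d)/n and is fixed before the cycle component is ever touched, which is precisely what makes per-component processing communication-free.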
Enhanced Enterprise Intelligence with your personal AI Data Copilot (GetInData)
Recently we have observed the rise of open-source Large Language Models (LLMs) that are community-driven or developed by the AI market leaders, such as Meta (Llama3), Databricks (DBRX) and Snowflake (Arctic). On the other hand, there is a growth in interest in specialized, carefully fine-tuned yet relatively small models that can efficiently assist programmers in day-to-day tasks. Finally, Retrieval-Augmented Generation (RAG) architectures have gained a lot of traction as the preferred approach for LLMs context and prompt augmentation for building conversational SQL data copilots, code copilots and chatbots.
In this presentation, we will show how we built upon these three concepts a robust Data Copilot that can help to democratize access to company data assets and boost performance of everyone working with data platforms.
Why do we need yet another (open-source ) Copilot?
How can we build one?
Architecture and evaluation
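The RAG pattern mentioned above reduces to two steps: retrieve the most relevant documents, then build an augmented prompt around the question. A deliberately tiny sketch with a toy corpus and crude token-overlap retrieval (real systems use embeddings and a vector store; all names below are illustrative):

```python
def retrieve(query, corpus, k=2):
    """Rank documents by naive token overlap with the query."""
    q = set(query.lower().split())
    return sorted(corpus,
                  key=lambda doc: len(q & set(doc.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, corpus):
    """RAG-style prompt: stuff retrieved context ahead of the question."""
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "The sales table lives in the warehouse schema analytics.sales",
    "Deploys run every Friday at noon",
    "Revenue is computed as sum of amount in analytics.sales",
]
prompt = build_prompt("How is revenue computed from the sales table?", corpus)
```

The point of the architecture is that the LLM never needs the whole knowledge base in its context window; only the top-k retrieved snippets travel with each question.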
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake (Walaa Eldin Moustafa)
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
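The core idea, routing reads through a compliance-enforcing SQL view instead of the raw table, can be illustrated generically. This is not ViewShift's code; the table, columns, and masking rule below are invented for the sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER, name TEXT, email TEXT, consented INTEGER)")
conn.executemany("INSERT INTO members VALUES (?, ?, ?, ?)",
                 [(1, "Ada", "ada@example.com", 1),
                  (2, "Bob", "bob@example.com", 0)])

# A policy-enforcing view: mask the email of members who have not consented.
# In a ViewShift-style setup the catalog would resolve the table name to
# this view automatically; here we query it explicitly.
conn.execute("""
    CREATE VIEW members_masked AS
    SELECT id, name,
           CASE WHEN consented = 1 THEN email ELSE 'REDACTED' END AS email
    FROM members
""")
rows = conn.execute("SELECT id, email FROM members_masked ORDER BY id").fetchall()
# rows -> [(1, 'ada@example.com'), (2, 'REDACTED')]
```

Because the transformation lives in the view definition, every engine that resolves the view gets the same consent-aware behavior without per-query changes, which is the portability property the slides emphasize.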
#SQL #Views #Privacy #Compliance #DataLake
Global Situational Awareness of A.I. and where it's headed (vikram sood)
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we're lucky, we'll be in an all-out race with the CCP; if we're unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
Adjusting primitives for graph : SHORT REPORT / NOTES (Subhajit Sahu)
Graph algorithms, like PageRank, commonly operate on Compressed Sparse Row (CSR), an adjacency-list based graph representation.
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Adjusting OpenMP PageRank : SHORT REPORT / NOTES (Subhajit Sahu)
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take advantage of a shared memory system with multiple CPUs, each with multiple cores, to accelerate PageRank computation. If the NUMA architecture of the system is properly taken into account with good vertex partitioning, the speedup can be significant. To take steps in this direction, experiments are conducted to implement PageRank in OpenMP using two different approaches, uniform and hybrid. The uniform approach runs all primitives required for PageRank in OpenMP mode (with multiple threads). On the other hand, the hybrid approach runs certain primitives (i.e., sumAt, multiply) in sequential mode.
1. Margaret Furr
MINING CRIME INCIDENT DATA FOR SPACE-TIME PATTERNS AROUND WASHINGTON, DC CAMPUSES
Photo Source: http://www.thecollegefix.com/post/19184/
2. DATA
• 2012 Crime Incidents; 2013 Crime Incidents; 2014 Crime Incidents
• File type: shapefile, points
• Variables of interest: X coordinates, Y coordinates, Offense, Report Date, Start Date, End Date
• Observations: 109,656
• Projection: Spatial reference = 26985
• Source: Open DC Data
• University and College Campuses; Campus Areas Zoning
• File type: shapefile, polygons
• Variables of interest: Campus names, Campus areas, Campus lengths
• Observations: 8 college/university campus zoning areas, 30 colleges/universities
• Projection: Spatial reference = 26985
• Source: Open DC Data
• USStates
• File type: shapefile, polygons
• Observations: 1 for each state + 1 for Washington, DC
• Projection: GCS_WGS_1984
• Source: United States Geodatabase
Photo Source: http://opendata.dc.gov/
3. RESEARCH MOTIVATIONS
• Crime -- less likely to occur on campuses
• Washington, DC -- not a city with the most crime
• 2 DC universities - ranked as the most dangerous campuses in some news reports
• Gallaudet University
• Howard University (7 forcible rapes, 90 robberies, 27 aggravated assaults, 160 burglaries, 43 car thefts)
• In general -- growing concern about students’ safety on campuses since the 1990s
• Understanding crime frequencies and trends across time helps (1) police departments,
(2) administrators, (3) policymakers, (4) journalists and (5) students make decisions
4. RESEARCH MOTIVATIONS
• Researchers have analyzed space-time crime patterns
• None have looked at DC crimes as they relate to campuses’ locations
• I have conducted spatial-regression analyses using Chicago crime data and R spatial packages
• I have not conducted space-time analyses using the time series component of crime data, but I find this component to be an important one
• I have not analyzed crime data from the DC area, but this is where 2 universities are reported to be unsafe
• I have not used ArcGIS tools yet!
5. RESEARCH QUESTIONS
• What is the frequency of each crime type at locations near campuses, and how do these frequencies compare to the frequencies of each type far from campuses?
• Looking at the start datetimes of 2012-2014 crime reports, are there any patterns in when each incident occurs near campuses?
• Do these patterns reveal emerging trends? If so, how do trends differ by crime type?
• What trends are emerging near the two most dangerously ranked DC campuses?
6. RESEARCH APPROACH
• Space-Time Analysis of crime within buffered campus areas
• Space-Time Cube
• Emerging Hot Spot Analysis
7. TOOLS
• Define Projection and Project
• Selection
• Merge
• Buffer
• Frequency Analysis
• Convert Time
• Create Space-Time Cube
• Emerging Hot Spot Analysis
Photo Sources: https://desktop.arcgis.com/en/desktop/latest/tools/space-time-pattern-mining-toolbox/learnmorecreatecube.htm
https://desktop.arcgis.com/en/desktop/latest/tools/space-time-pattern-mining-toolbox/learnmoreemerging.htm
9. PROJECTION
• Crime Incident data and Campus Area data had no projection
• Metadata said this data was 26985
• Defined the data as NAD83 Maryland (State Plane)
• Reprojected the data to NAD83 UTM Zone 17N
• DC data had GCS_WGS_1984 coordinates (WGS_1984 datum)
• Reprojected the data to NAD83 UTM Zone 17N
10. MERGE
• Merged Crime Incidents
• 2012 Crime Incidents (NAD83-17N)
• 2013 Crime Incidents (NAD83-17N)
• 2014 Crime Incidents (NAD83-17N)
• Merged Campus Areas
• University College Campuses (NAD83-17N)
• Campus Areas Zoning (NAD83-17N)
11. DC LAYER: SELECT
• DC Open Data does not have a shapefile for the general DC area
• UnitedStates.gdb has a USStates shapefile, which includes DC as one state
• Selected DC from USStates
• Exported the selection as its own DC layer (DC_NAD8317)
15. CREATE BUFFERS
• Create buffer around campus areas
• 1000 ft. is a standard buffer for schools
• 1000 ft ≈ 305 m, used as the first buffer distance
• 1500 ft (458 m) and 2000 ft (610 m) as the second and third buffer distances
• SELECT crime points within the buffers
• 40,873 incidents fall within 2000 ft (610 m) of campus areas
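The buffer selection on this slide, keeping only incidents within a given distance of campus areas, can be approximated in plain Python once the data is in a projected (meter-based) coordinate system. A toy sketch with a hypothetical campus centroid and made-up incident coordinates (the real workflow buffers polygons, not a single center point):

```python
from math import hypot

def within_buffer(points, center, buffer_ft):
    """Select projected (x, y) points within buffer_ft feet of center.
    Coordinates are assumed to be in meters (e.g., UTM), so plain
    Euclidean distance is meaningful."""
    buffer_m = buffer_ft * 0.3048          # feet -> meters (1000 ft = 304.8 m)
    cx, cy = center
    return [p for p in points if hypot(p[0] - cx, p[1] - cy) <= buffer_m]

# Hypothetical incident coordinates (meters) around a campus centroid at (0, 0)
incidents = [(100, 50), (250, 200), (400, 300), (700, 100)]
near = within_buffer(incidents, (0, 0), 1000)   # 1000 ft buffer -> [(100, 50)]
```

Widening the buffer to 2000 ft sweeps in more incidents, which mirrors the growing counts across the three buffers in the slide's table.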
19. BUFFERED FREQUENCIES BY TYPE

Offense                        All data  Buffer 1  Buffer 2  Buffer 3
Arson                                95        11        15        21
Assault w/ dangerous weapon        7224       783      1174      1533
Burglary                          10151      1470      2247      2991
Homicide                            296        28        39        52
Motor vehicle theft                8633       967      1573      2070
Robbery                           11470      1568      2437      3341
Sex abuse                           862       171       232       269
Theft f/ auto                     31299      6283      9650     13162
Theft / other                     39269      8540     13561     17414

[Bar chart: Frequency Differences — each buffer's type frequencies as a
share of the all-data type frequencies (0–50%), for buffers 1–3]
20. SPACE-TIME PATTERN MINING TOOLBOX:
CREATE SPACE TIME CUBE
• “Summarizes a set of points into a netCDF data structure by aggregating
them into space-time bins.”
• “Within each bin, the points are counted.”
• “For all bin locations, the trend for counts over time are evaluated.”
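The aggregation the tool describes can be illustrated with a minimal pure-Python sketch: each (x, y, date) event is assigned to a (column, row, time-step) bin and the bins are counted. The 134 m cell size and 7-day step echo the cube parameters reported later in these slides; the events themselves are invented.

```python
from collections import Counter
from datetime import date

def space_time_bins(events, distance_interval, time_step_days, origin):
    """Count (x, y, date) events in space-time bins, mimicking the
    aggregation performed by Create Space Time Cube."""
    counts = Counter()
    for x, y, d in events:
        col = int(x // distance_interval)
        row = int(y // distance_interval)
        step = (d - origin).days // time_step_days
        counts[(col, row, step)] += 1
    return counts

# Invented events: two in the same cell and week, one elsewhere
events = [
    (10.0, 20.0, date(2012, 1, 2)),
    (15.0, 25.0, date(2012, 1, 3)),
    (300.0, 20.0, date(2012, 1, 20)),
]
cube = space_time_bins(events, distance_interval=134.0,
                       time_step_days=7, origin=date(2012, 1, 1))
```

The real tool additionally writes the counts to a netCDF structure and evaluates a trend per bin location, which this sketch omits.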
21. SPACE-TIME PATTERN MINING TOOLBOX:
EMERGING HOT SPOT ANALYSIS
• “Identifies trends in the clustering of point densities (counts) or
summary fields in a space time cube created [using the Create Space Time
Cube tool].”
• “Categories: (1) new, (2) consecutive, (3) intensifying, (4) persistent,
(5) diminishing, (6) sporadic, (7) oscillating, and (8) historical hot and
cold spots.”
22. CONVERT TIME
• Original date fields were String type
• Date type is required by the Space-Time Pattern Mining tools
• Converted Report date, Start date, and End date
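In ArcGIS this string-to-date conversion is done with the Convert Time Field tool; the same idea in plain Python looks like the sketch below. The "%m/%d/%Y %H:%M" format string is an assumption for illustration, not the documented DC Open Data field format.

```python
from datetime import datetime

def to_date(field):
    """Parse a report/start/end date string into a real datetime,
    the type required by the Space-Time Pattern Mining tools.
    The format string is an assumed example, not the actual
    DC Open Data layout."""
    return datetime.strptime(field, "%m/%d/%Y %H:%M")

start = to_date("01/15/2013 23:40")
```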
24. CREATE SPACE TIME CUBE
• Time Step Alignment options: start time, end time, reference time
• Chose end time; eliminates the bias of choosing a reference time
• Time Step Interval: 1 week
• Distance Interval:
• Calculated an optimal interval with an algorithm that considers the
spatial distribution (histogram bin-width optimization)
• Template Cube: not used
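Esri does not publish its exact interval-optimization algorithm in these slides, but data-driven histogram bin-width rules of this family are standard. Scott's rule is one such rule, shown here purely as an illustration; the tool's internal algorithm may differ, and the coordinates are invented.

```python
import statistics

def scott_bin_width(values):
    """Scott's histogram bin-width rule: 3.49 * stdev * n^(-1/3).
    An illustrative analogue of the tool's data-driven distance
    interval, not Esri's published algorithm."""
    n = len(values)
    return 3.49 * statistics.stdev(values) * n ** (-1.0 / 3.0)

# Hypothetical 1-D projected coordinates in meters
xs = [0.0, 50.0, 120.0, 200.0, 260.0, 400.0, 520.0, 610.0]
width = scott_bin_width(xs)  # roughly 387 m for this toy sample
```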
25. STATISTICS FROM SPACE TIME CUBE
• Mann-Kendall Statistic
• Statistical question: “are the events represented by the input points increasing or
decreasing over time?”
• Input: “the number of points for all locations in each time-step interval,
analyzed as a time series of count values”
• Rank correlation analysis between the bin counts (or values) and their time sequence
• +1 if an earlier bin < a later bin
• -1 if an earlier bin > a later bin
• 0 if the bins are equal
• Results are summed
• The observed sum is compared to the expected sum
• p-value
• Small p-value: the trend is statistically significant
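The pairwise comparisons above can be written out as a small function. Note that the full Mann-Kendall S statistic compares every earlier/later pair of bins, not just consecutive ones; the sketch below is that standard form, run on toy count series.

```python
def mann_kendall_s(counts):
    """Mann-Kendall S statistic for a time series of bin counts:
    +1 for each later value above an earlier one, -1 for each below,
    0 for ties, summed over all ordered pairs."""
    s = 0
    n = len(counts)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if counts[j] > counts[i]:
                s += 1
            elif counts[j] < counts[i]:
                s -= 1
    return s

rising = mann_kendall_s([1, 2, 3, 5, 8])  # every pair increases: S = 10
flat = mann_kendall_s([4, 4, 4, 4])       # all ties: S = 0
```

In the tool, the observed S is then compared against its expected value and variance under no trend to produce the p-value.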
26. STATISTICS FROM SPACE TIME CUBE

                                     All Data       Buffered Data 3
Total Number of Locations                8989                  8400
Locations with at least one point        2945                  1814
Associated Bins                       5289220               3198082
% Non-zero (Sparseness)                  1.54                  1.03
Time Step Interval                     1 week                1 week
Distance Interval                       201 m                 134 m
Number of Time Steps                     1796                  1796

Cube Extent Across Space
Min X                          836904.0313 m         837838.3256 m
Min Y                         4303610.9737 m        4309021.9271 m
Max X                          854777.2083 m         849063.8549 m
Max Y                         4323770.1568 m        4322418.4232 m
Rows                                      101                   100
Columns                                    89                    84
Total bins                           16144244              14809200

Overall Data Trend
Trend Direction                    Increasing            Increasing
Trend Statistic                       24.3788               23.9058
Trend p-value                               0                     0
27. STATISTICS FROM SPACE TIME CUBES BY TYPE
• Theft/Other: Increasing Trend, Significant
• Theft f/Auto: Increasing Trend, Significant
• Robbery: Decreasing Trend (statistic > -5), Significant
• Burglary: Increasing Trend (statistic < 5, unlike the others’ > 18), Significant
(p-value > 0, unlike the others’ p-values of 0)
• Motor Vehicle Theft: Increasing Trend (statistic < 6), Significant
• Assault with Dangerous Weapon: Decreasing Trend, Not Significant
• Sex Abuse: Increasing Trend, Significant
• Homicide: too few records for analysis
• Arson: too few records for analysis
32. CONCLUSIONS
• Around Howard University there are significant patterns of theft/auto crimes and also sexual
assault
• Theft/auto incidents are emerging consecutively
• Sexual assault incidents are emerging sporadically
• These two types are among those reported in the media
• Around George Washington University and Georgetown University there are also significant
patterns of theft/other crimes
• Theft/other incidents are emerging consecutively on the non-river side of George Washington
University, toward Howard University
• Theft/other incidents are emerging sporadically on the river side of Georgetown University
and George Washington University
• Both types of theft are emerging consecutively around other US colleges’ and universities’ DC
campuses
• Cornell in Washington, University of California Washington Center
33. FUTURE WORK
• Analyze more years of crime data
• Experiment with ArcGIS Pro or ArcGIS’s multidimensional toolbox
to visualize the space-time cube and understand it better
• Develop a Python script to tune (optimize) the time and distance
intervals, as well as the buffer distance, for more accurate results
• Experiment with the template cube for potentially more consistent
results across types
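The parameter-tuning idea could start as a brute-force grid search over the three settings. The scoring function below is a placeholder; in practice it would rebuild the cube (for example via arcpy) and evaluate the result, which is beyond this sketch, and the preferred values in the toy score are simply the ones used in this study.

```python
from itertools import product

def tune_cube_parameters(score, time_steps, distance_intervals, buffers):
    """Grid-search time step (days), distance interval (m), and buffer
    distance (m); returns (best_score, t, d, b). `score` is supplied
    by the caller."""
    best = None
    for t, d, b in product(time_steps, distance_intervals, buffers):
        value = score(t, d, b)
        if best is None or value > best[0]:
            best = (value, t, d, b)
    return best

# Toy score that happens to prefer the settings used in this study
def toy_score(t, d, b):
    return -(abs(t - 7) + abs(d - 134.0) / 10 + abs(b - 305.0) / 10)

best = tune_cube_parameters(toy_score, [7, 14], [100.0, 134.0], [305.0, 610.0])
```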