2011 NIJ Crime Mapping Conference - Data Mining and Risk Forecasting in Web-based Analysis Tools
  • We fulfill custom map requests, provide data for CompStat, and maintain a website that is updated daily to support the districts' day-to-day functions.
  • Transcript

    • 1. Data Mining and Risk Forecasting in Web-Based Analysis Tools
      • 340 N 12th St, Suite 402, Philadelphia, PA 19107
      • 215.925.2600
      • [email_address]
      • www.azavea.com/hunchlab
    • 2. Agenda
      • Who we are
      • What we are looking to do with software
      • What we’ve learned building risk forecasting features
        • (forecasting versus prediction)
    • 3. About Azavea
      • Founded in 2000
      • 30 people
      • Based in Philadelphia
        • Boston office
        • Minneapolis office
      • Geospatial + web + mobile
        • Software development
        • Spatial analysis services
    • 4. Clients & Industries
      • Public Safety
      • Municipal Services
      • Public Health
      • Human Services
      • Culture
      • Elections & Politics
      • Land Conservation
      • Economic Development
    • 5. HunchLab was developed, in part, based upon work supported by the National Science Foundation under Grant Nos. IIP-0637589 and IIP-0750507.
    • 6. The Backstory
    • 7. How Phila PD uses GIS
      • Customized Map Products
      • Weekly CompStat Meetings
      • Web Crime Analysis
    • 8. (Data flow diagram)
      • Diagram labels: Complainant; 911 Operator; Radio Dispatcher; Police Officer; District 48 Desk; Daily download & geocoding routines; Incident report completed by officer; Maps distributed through intranet, printing, CompStat; CAD; Verizon 911; INCT; District X; District Y; District Z; PARS
      • INCT & PARS: main database sources; over 5,000 incidents daily, over 2 million annually
    • 9. The Context
      • 1,500,000 people
      • 7,000 police officers
      • 1,000 civilian employees
      • 2,000,000 new incidents / year
      • 3 crime analysts
    • 10. How can software help?
    • 11. Our goals with software
      • Improve ease of use
        • Increase consumers of analysis
      • Automate time-intensive routines
        • Free up resources for things that can’t be automated
      • Increase sophistication
        • Accomplish things not possible manually
    • 12.
      • web-based crime analysis, early warning, and risk forecasting
    • 13.
      • Crime Analysis
        • Mapping (spatial / temporal densities)
        • Trending
        • Intelligence Dashboard
      • Early Warning
        • Statistical & Threshold-based Hunches (data mining)
        • Alerting
      • Risk Forecasting
        • Near Repeat Pattern
        • Load Forecasting
    • 14. Near Repeat Pattern Analysis
    • 15. Contagious Crime?
      • Near repeat pattern analysis
        • “If one burglary occurs, how does the risk change nearby?”
    • 16. Near Repeat Pattern Analysis
      • How can you test your own data?
        • Near Repeat Calculator
          • http://www.temple.edu/cj/misc/nr/
      • Papers
        • Near-Repeat Patterns in Philadelphia Shootings (2008)
          • One city block & two weeks after one shooting
            • 33% increase in likelihood of a second event
      • Jerry Ratcliffe, Temple University
    • 17. Near Repeat Pattern Analysis
      • The goal:
        • Quantify short term risk due to near-repeat victimization
          • “If one burglary occurs, how does the risk of burglary for the neighbors change?”
      • What we know:
        • Incident A (place, time) --> Incident B (place, time)
          • Distance between A and B
          • Timeframe between A and B
      • What we need to know:
        • What distances/timeframes are not simply random?
    • 18. Near Repeat Pattern Analysis
      • The process
        • Observe the pattern in historic data
        • Simulate the pattern in randomized historic data
        • Compare the observed pattern to the simulated patterns
        • Apply the non-random pattern to new incidents
      • An example
        • 180 days of burglaries in Division 6 of Philadelphia
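The observe / simulate / compare steps on slide 18 can be sketched as a Knox-style Monte Carlo test. This is a minimal illustration only, not the HunchLab or Near Repeat Calculator code: the function names, the band edges, the number of simulations, and the choice to randomize by shuffling incident dates across locations are all assumptions made for the example.

```python
import math
import random
from itertools import combinations


def band_index(value, edges):
    """Return the index of the first band whose upper edge exceeds value, or None."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return None


def pair_counts(points, days, dist_edges, time_edges):
    """Count incident pairs falling in each (distance band, time band) cell."""
    counts = [[0] * len(time_edges) for _ in dist_edges]
    for a, b in combinations(range(len(points)), 2):
        d = math.dist(points[a], points[b])   # distance between A and B
        t = abs(days[a] - days[b])            # timeframe between A and B
        i, j = band_index(d, dist_edges), band_index(t, time_edges)
        if i is not None and j is not None:
            counts[i][j] += 1
    return counts


def near_repeat_test(points, days, dist_edges, time_edges, n_sims=99, seed=0):
    """Compare observed pair counts with counts after shuffling the dates.

    Returns (ratio, p_value) tables. ratio is observed / mean simulated count;
    p_value is the share of runs (the observed run included) whose count is
    at least as large as the observed count.
    """
    rng = random.Random(seed)
    n_d, n_t = len(dist_edges), len(time_edges)
    observed = pair_counts(points, days, dist_edges, time_edges)
    shuffled = list(days)
    sums = [[0] * n_t for _ in range(n_d)]
    at_least = [[1] * n_t for _ in range(n_d)]  # the observed run counts once
    for _ in range(n_sims):
        rng.shuffle(shuffled)  # hypothetical randomization: permute dates only
        sim = pair_counts(points, shuffled, dist_edges, time_edges)
        for i in range(n_d):
            for j in range(n_t):
                sums[i][j] += sim[i][j]
                if sim[i][j] >= observed[i][j]:
                    at_least[i][j] += 1
    ratio = [[observed[i][j] / (sums[i][j] / n_sims) if sums[i][j] else float("nan")
              for j in range(n_t)] for i in range(n_d)]
    p_value = [[at_least[i][j] / (n_sims + 1) for j in range(n_t)]
               for i in range(n_d)]
    return ratio, p_value
```

Cells with a ratio well above 1 and a small p-value are the distance/timeframe bands that are "not simply random"; those are the bands one would carry forward when scoring risk around new incidents.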
    • 19. Near Repeat Pattern Analysis
    • 20. Near Repeat Pattern Analysis
    • 21. Near Repeat Pattern Analysis
    • 22. Near Repeat Pattern Analysis
    • 23. Near Repeat Pattern Analysis
      • What did we learn?
        • Having a reference implementation was very helpful
          • Aids in translation of research into software
        • Analysts simplify things to make operationalization possible
          • They simplify risk bands to ease map making
        • Academics leave large questions unanswered
          • What happens when risk areas overlap?
    • 24.
    • 25. Load Forecasting
    • 26. Improving CompStat
      • Load forecasting
        • “Given the time of year, day of week, time of day and general trend, what counts of crimes should I expect?”
    • 27. What Do We Mean By Load Forecasting?
      • Load forecasting
        • Generating aggregate crime counts for a future timeframe using cyclical time series analysis
      • Diagram steps: Measure cyclical patterns; Identify non-cyclical trend; Forecast expected count
      • Reference: bit.ly/gorrcrimeforecastingpaper
    • 28. Load Forecasting
      • Measure cyclical patterns
        • Take historic incidents (for example: last five years)
        • Generate multiplicative seasonal indices
          • For each time cycle:
            • time of year
            • day of week
            • time of day
          • Count incidents within each time unit (for example: Monday)
          • Calculate average per time unit if incidents were evenly distributed
          • Divide counts within each time unit by the calculated average to generate multiplicative indices
            • Index ~ 1 means at the average
            • Index > 1 means above average
            • Index < 1 means below average
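The index calculation on slide 28 can be sketched in a few lines. This is an illustrative sketch, not the HunchLab implementation: `seasonal_indices` and `day_of_week` are hypothetical names, and the code assumes the history covers each time unit equally often (for example, whole weeks when indexing by day of week).

```python
from collections import Counter
from datetime import date


def seasonal_indices(incident_dates, unit_of, n_units):
    """Multiplicative index per time unit: observed count / evenly spread count."""
    if not incident_dates:
        raise ValueError("need at least one incident")
    counts = Counter(unit_of(d) for d in incident_dates)
    even_share = len(incident_dates) / n_units  # average if evenly distributed
    return [counts.get(u, 0) / even_share for u in range(n_units)]


def day_of_week(d):
    """Map a date to its day-of-week unit (Monday = 0 ... Sunday = 6)."""
    return d.weekday()


# One week of made-up incidents, Monday 2011-01-03 through Sunday 2011-01-09.
history = (
    [date(2011, 1, 3)] * 2
    + [date(2011, 1, 4), date(2011, 1, 5), date(2011, 1, 6), date(2011, 1, 7)]
    + [date(2011, 1, 8)] * 4
    + [date(2011, 1, 9)] * 4
)
indices = seasonal_indices(history, day_of_week, 7)
```

With 14 incidents over 7 days the even share is 2 per day, so the weekend days (4 incidents each) get an index of 2.0 and the midweek days (1 each) get 0.5. The same function works for the other cycles by swapping `unit_of`, for example a month-of-year or hour-of-day mapping.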
    • 29. Load Forecasting
    • 30. Load Forecasting
    • 31. Load Forecasting
    • 32. Load Forecasting
    • 33. Load Forecasting
      • Identify non-cyclical trend
        • Take recent daily counts (for example: last year daily counts)
        • Remove cyclical trends by dividing by indices
        • Run a trending function on the new counts
          • Simple average
            • Last X Days
          • Smoothing function
            • Exponential smoothing
            • Holt’s linear exponential smoothing
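The trend step on slide 33 can be sketched as follows. The slides do not give smoothing parameters, initialization, or how several cycles are combined into one index per day, so `alpha`, `beta`, the starting values, and the single per-day index list are assumptions for illustration; the function names are hypothetical.

```python
def deseasonalize(daily_counts, index_for_day):
    """Divide each day's count by that day's seasonal index (must be non-zero)."""
    return [count / index_for_day[i] for i, count in enumerate(daily_counts)]


def simple_average(series, last_n):
    """Average of the last `last_n` values."""
    window = series[-last_n:]
    return sum(window) / len(window)


def exponential_smoothing(series, alpha):
    """Return the final smoothed level; higher alpha weights recent days more."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def holt_linear(series, alpha, beta):
    """Return the final (level, trend) from Holt's linear exponential smoothing.

    Needs at least two observations; the trend starts as the first difference.
    """
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level, trend
```

The simple average and plain exponential smoothing produce only a level, which corresponds to a flat projection; Holt's method also tracks a slope, which is what allows a linear projection.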
    • 34. Load Forecasting
      • Forecast expected count
        • Project trend into future timeframe
          • Always flat
            • Simple average
            • Exponential smoothing
          • Linear trend
            • Holt’s linear exponential smoothing
        • Multiply by seasonal indices to reseasonalize the data
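The projection and reseasonalizing step on slide 34 can be sketched as a single function. This is an assumption-laden sketch: `forecast_counts` is a hypothetical name, and `level` and `trend` are taken to be the outputs of whichever trending function was run on the deseasonalized counts.

```python
def forecast_counts(level, trend, future_indices):
    """Project the deseasonalized trend forward, then reseasonalize.

    trend = 0.0 gives the flat projection (simple average or exponential
    smoothing); a non-zero trend gives the linear projection from Holt's method.
    future_indices holds the seasonal index for each future day, in order.
    """
    return [(level + trend * h) * idx
            for h, idx in enumerate(future_indices, start=1)]


# Flat projection of 10 incidents/day onto a busy day and a quiet day.
flat_forecast = forecast_counts(10.0, 0.0, [2.0, 0.5])

# Linear projection: level 18, rising by 2 per day.
linear_forecast = forecast_counts(18.0, 2.0, [1.0, 0.5])
```

Summing the daily values gives the aggregate expected count for the future timeframe, for example a 28-31 day CompStat period.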
    • 35. Load Forecasting
      • Diagram steps: Measure cyclical patterns; Identify non-cyclical trend; Forecast expected count
      • Reference: bit.ly/gorrcrimeforecastingpaper
    • 36. How Do We Know It’s Accurate?
      • Testing
        • Generated multiple forecasting techniques (examples)
          • Commonly Used
            • Average of last 30 days
            • Average of last 365 days
            • Last year’s count for the same time period
          • Advanced Combinations
            • Different cyclical indices (example: day of year vs. month of year)
            • Different levels of geographic aggregation for indices
            • Different trending functions
        • Scoring methodologies (examples)
          • Mean absolute percent error (with some enhancements)
          • Mean percent error
          • Mean squared error
        • Run thousands of forecasts through testing framework
        • Choose the right technique in the right situation
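The three scoring measures named on slide 36 can be sketched in their textbook forms. The "enhancements" mentioned on the slide are not described, so none are reproduced here; skipping periods whose actual count is zero and the sign convention (positive means over-forecasting) are assumptions of this sketch, and the function names are hypothetical.

```python
def _nonzero_pairs(actual, forecast):
    """Pair up values, skipping periods with zero actual count (assumption)."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no periods with a non-zero actual count")
    return pairs


def mean_absolute_percent_error(actual, forecast):
    """Average size of the error as a percentage of the actual count."""
    pairs = _nonzero_pairs(actual, forecast)
    return 100.0 * sum(abs(f - a) / abs(a) for a, f in pairs) / len(pairs)


def mean_percent_error(actual, forecast):
    """Signed percentage error; values far from zero indicate bias."""
    pairs = _nonzero_pairs(actual, forecast)
    return 100.0 * sum((f - a) / a for a, f in pairs) / len(pairs)


def mean_squared_error(actual, forecast):
    """Average squared error; penalizes large misses more heavily."""
    errors = [(f - a) ** 2 for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


actual_counts = [100, 200]
forecast_counts_example = [110, 180]
mape = mean_absolute_percent_error(actual_counts, forecast_counts_example)
mpe = mean_percent_error(actual_counts, forecast_counts_example)
mse = mean_squared_error(actual_counts, forecast_counts_example)
```

In this made-up example each forecast misses by 10%, once high and once low, so the absolute percent error averages 10 while the signed percent error cancels to roughly zero. That is why the measures are complementary: one reports accuracy, the other bias.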
    • 37. How Do We Know It’s Accurate?
      • Error for 28-31 day forecasts for any Part X Series
        Area                      Last 30 Days  Last Year  Load Forecast  Error Reduction
        Philadelphia - Citywide   6.8%          6.5%       4.1%           39%
        Philadelphia - Divisions  8.1%          8.4%       5.8%           28%
        Philadelphia - Districts  10.9%         11.7%      9.3%           15%
        Lincoln, NE - Citywide    13%           11%        10%            23%
    • 38. Improving CompStat
      • Load forecasting
        • “Given the time of year, day of week, time of day and general trend, what counts of crimes should I expect?”
    • 39. Improving CompStat
    • 40. What’s next?
    • 41. Contact Information
      • Jeremy Heffner, HunchLab Product Manager
      • [email_address]
      • 215.701.7712
      • www.azavea.com/hunchlab
