2011 NIJ Crime Mapping Conference - Data Mining and Risk Forecasting in Web-based Analysis Tools
Published in: Technology, Business

0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
552
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
0
Comments
0
Likes
2
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide
  • We fulfill custom map requests, provide data for CompStat, and maintain a website, updated daily, that supports the day-to-day functions of the districts.
  • Transcript

    • 1. Data Mining and Risk Forecasting in Web-Based Analysis Tools
      • 340 N 12th St, Suite 402, Philadelphia, PA 19107
      • 215.925.2600
      • [email_address]
      • www.azavea.com/hunchlab
    • 2. Agenda
      • Who we are
      • What we are looking to do with software
      • What we’ve learned building risk forecasting features
        • (forecasting versus prediction)
    • 3. About Azavea
      • Founded in 2000
      • 30 people
      • Based in Philadelphia
        • Boston office
        • Minneapolis office
      • Geospatial + web + mobile
        • Software development
        • Spatial analysis services
    • 4. Clients & Industries
      • Public Safety
      • Municipal Services
      • Public Health
      • Human Services
      • Culture
      • Elections & Politics
      • Land Conservation
      • Economic Development
    • 5. HunchLab was developed, in part, based upon work supported by the National Science Foundation under Grant Nos. IIP-0637589 and IIP-0750507.
    • 6. The Backstory
    • 7. How Phila PD uses GIS
      • Customized Map Products
      • Weekly CompStat Meetings
      • Web Crime Analysis
    • 8. [Diagram: flow of incident data from 911 call to distributed maps]
      • People and steps: Complainant; 911 Operator; Radio Dispatcher; Police Officer; District 48 Desk; Incident Report Completed by Officer; Daily download & Geocoding Routines; Maps distributed through Intranet, Printing, CompStat
      • Systems: Verizon 911; CAD; INCT; PARS (District X, District Y, District Z)
      • INCT & PARS are the main database sources: over 5,000 incidents daily, over 2 million annually
    • 9. The Context
      • 1,500,000 people
      • 7,000 police officers
      • 1,000 civilian employees
      • 2,000,000 new incidents / year
      • 3 crime analysts
    • 10. How can software help?
    • 11. Our goals with software
      • Improve ease of use
        • Increase consumers of analysis
      • Automate time-intensive routines
        • Free up resources for things that can’t be automated
      • Increase sophistication
        • Accomplish things not possible manually
    • 12.
      • web-based crime analysis, early warning, and risk forecasting
    • 13.
      • Crime Analysis
        • Mapping (spatial / temporal densities)
        • Trending
        • Intelligence Dashboard
      • Early Warning
        • Statistical & Threshold-based Hunches (data mining)
        • Alerting
      • Risk Forecasting
        • Near Repeat Pattern
        • Load Forecasting
    • 14. Near Repeat Pattern Analysis
    • 15. Contagious Crime?
      • Near repeat pattern analysis
          • “If one burglary occurs, how does the risk change nearby?”
    • 16. Near Repeat Pattern Analysis
      • How can you test your own data?
        • Near Repeat Calculator
          • http://www.temple.edu/cj/misc/nr/
      • Papers
        • Near-Repeat Patterns in Philadelphia Shootings (2008)
          • One city block & two weeks after one shooting
            • 33% increase in likelihood of a second event
      • Jerry Ratcliffe, Temple University
    • 17. Near Repeat Pattern Analysis
      • The goal:
        • Quantify short term risk due to near-repeat victimization
          • “If one burglary occurs, how does the risk of burglary for the neighbors change?”
      • What we know:
        • Incident A (place, time) --> Incident B (place, time)
          • Distance between A and B
          • Timeframe between A and B
      • What we need to know:
        • What distances/timeframes are not simply random?
    • 18. Near Repeat Pattern Analysis
      • The process
        • Observe the pattern in historic data
        • Simulate the pattern in randomized historic data
        • Compare the observed pattern to the simulated patterns
        • Apply the non-random pattern to new incidents
      • An example
        • 180 days of burglaries in Division 6 of Philadelphia
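The observe / simulate / compare steps can be sketched as a Monte Carlo permutation test. The slides do not say how the historic data is randomized; the sketch below assumes the common approach of shuffling incident dates across the fixed locations, and it tests a single distance/time band. The function names, the band limits, and the number of simulations are hypothetical choices for illustration, not HunchLab's implementation.

```python
import random
from itertools import combinations
from math import hypot


def count_close_pairs(points, days, max_dist, max_days):
    """Count incident pairs within max_dist of each other AND within max_days."""
    count = 0
    for i, j in combinations(range(len(points)), 2):
        dist = hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
        if dist <= max_dist and abs(days[i] - days[j]) <= max_days:
            count += 1
    return count


def near_repeat_test(points, days, max_dist, max_days, n_sims=199, seed=0):
    """Compare the observed close-pair count with counts from shuffled dates.

    Assumption: randomization keeps locations fixed and permutes the dates,
    which breaks any space-time link while preserving both marginals.
    Returns (observed_count, p_value).
    """
    rng = random.Random(seed)
    observed = count_close_pairs(points, days, max_dist, max_days)
    shuffled = list(days)
    as_extreme = 0
    for _ in range(n_sims):
        rng.shuffle(shuffled)
        if count_close_pairs(points, shuffled, max_dist, max_days) >= observed:
            as_extreme += 1
    # Standard Monte Carlo p-value: the observed data counts as one arrangement.
    p_value = (as_extreme + 1) / (n_sims + 1)
    return observed, p_value
```

A small p-value suggests that pairs in this distance/time band occur more often than chance alone would produce. In practice the test would be repeated over a grid of distance and time bands, and only the non-random bands applied to new incidents.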
    • 19. Near Repeat Pattern Analysis
    • 20. Near Repeat Pattern Analysis
    • 21. Near Repeat Pattern Analysis
    • 22. Near Repeat Pattern Analysis
    • 23. Near Repeat Pattern Analysis
      • What did we learn?
        • Having a reference implementation was very helpful
          • Aids in translation of research into software
        • Analysts simplify things to make operationalization possible
          • They simplify risk bands to ease map making
        • Academics leave large questions unanswered
          • What happens when risk areas overlap?
    • 24.  
    • 25. Load Forecasting
    • 26. Improving CompStat
      • Load forecasting
          • “Given the time of year, day of week, time of day and general trend, what counts of crimes should I expect?”
    • 27. What Do We Mean By Load Forecasting?
      • Load forecasting
          • Generating aggregate crime counts for a future timeframe using cyclical time series analysis
      • [Diagram: Measure cyclical patterns + Identify non-cyclical trend → Forecast expected count]
      • bit.ly/gorrcrimeforecastingpaper
    • 28. Load Forecasting
      • Measure cyclical patterns
          • Take historic incidents (for example: last five years)
          • Generate multiplicative seasonal indices
            • For each time cycle:
              • time of year
              • day of week
              • time of day
            • Count incidents within each time unit (for example: Monday)
            • Calculate average per time unit if incidents were evenly distributed
            • Divide counts within each time unit by the calculated average to generate multiplicative indices
              • Index ~ 1 means at the average
              • Index > 1 means above average
              • Index < 1 means below average
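The index calculation above can be sketched in a few lines. This is a minimal illustration for one cycle (day of week); the function name `seasonal_indices` and its parameters are hypothetical, and the sketch assumes the history covers each time unit equally often (for example, whole weeks), so that an even split is a fair baseline.

```python
from collections import Counter
from datetime import date, timedelta


def seasonal_indices(incident_dates, key, n_units):
    """Multiplicative index per time unit: observed count / evenly spread count.

    key maps a date to a unit number in 0..n_units-1 (e.g. weekday -> 0..6).
    Assumes every unit occurs equally often in the historic period.
    """
    counts = Counter(key(d) for d in incident_dates)
    expected = len(incident_dates) / n_units  # average if evenly distributed
    return {unit: counts.get(unit, 0) / expected for unit in range(n_units)}


# Made-up history: four whole weeks starting on a Monday, one incident per day
# plus one extra incident every Saturday.
start = date(2011, 1, 3)
days = [start + timedelta(days=i) for i in range(28)]
incidents = days + [d for d in days if d.weekday() == 5]

day_of_week_index = seasonal_indices(incidents, key=lambda d: d.weekday(), n_units=7)
# Saturday (5) comes out above 1; every other day comes out slightly below 1.
```

The same function could be reused for the other cycles by changing `key` and `n_units`, for example month of year with 12 units or hour of day with 24.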
    • 29. Load Forecasting
    • 30. Load Forecasting
    • 31. Load Forecasting
    • 32. Load Forecasting
    • 33. Load Forecasting
      • Identify non-cyclical trend
          • Take recent daily counts (for example: last year daily counts)
          • Remove cyclical trends by dividing by indices
          • Run a trending function on the new counts
            • Simple average
              • Last X Days
            • Smoothing function
              • Exponential smoothing
              • Holt’s linear exponential smoothing
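A minimal sketch of the trend step, assuming the seasonal index for each historic day is already known. The smoothing parameters `alpha` and `beta` and the simple initialization of Holt's method are illustrative assumptions; the slides do not give the values or the initialization used.

```python
def deseasonalize(counts, indices):
    """Remove the cyclical pattern by dividing each count by its index.

    Assumes no index is zero.
    """
    return [c / i for c, i in zip(counts, indices)]


def simple_average(values, last_x):
    """Flat trend estimate: mean of the last X values."""
    window = values[-last_x:]
    return sum(window) / len(window)


def exponential_smoothing(values, alpha):
    """Flat trend estimate that weights recent values more heavily."""
    level = values[0]
    for v in values[1:]:
        level = alpha * v + (1 - alpha) * level
    return level


def holt_linear(values, alpha, beta):
    """Holt's linear exponential smoothing: returns (level, trend per step).

    Needs at least two values; initialized from the first two observations.
    """
    level, trend = values[0], values[1] - values[0]
    for v in values[1:]:
        prev_level = level
        level = alpha * v + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level, trend
```

The first two functions produce a single level, which is why the next slide lists them under "always flat"; Holt's method also returns a slope, which allows a linear projection.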
    • 34. Load Forecasting
      • Forecast expected count
          • Project trend into future timeframe
            • Always flat
              • Simple average
              • Exponential smoothing
            • Linear trend
              • Holt’s linear exponential smoothing
          • Multiply by seasonal indices to reseasonalize the data
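The projection and reseasonalizing steps can be sketched as follows. The function name `forecast_counts` is hypothetical, and the sketch assumes the caller has already combined the relevant indices (time of year, day of week, time of day) into one multiplicative index per future step.

```python
def forecast_counts(level, trend, future_indices):
    """Project the deseasonalized trend forward, then reseasonalize.

    level, trend   -- output of the trending function; use trend=0.0 for the
                      "always flat" methods (simple average, exponential smoothing)
    future_indices -- one combined seasonal index per future step, in order
    """
    return [
        (level + trend * step) * index
        for step, index in enumerate(future_indices, start=1)
    ]


# Flat projection of 10 incidents/day onto a busy day and a quiet day.
flat = forecast_counts(10.0, 0.0, [1.5, 0.5])    # [15.0, 5.0]

# Linear projection: level 16 rising by 2 per day.
rising = forecast_counts(16.0, 2.0, [1.0, 0.5])  # [18.0, 10.0]
```

Summing the per-step values gives the aggregate count for the whole future timeframe.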
    • 35. Load Forecasting
      • [Diagram: Measure cyclical patterns + Identify non-cyclical trend → Forecast expected count]
      • bit.ly/gorrcrimeforecastingpaper
    • 36. How Do We Know It’s Accurate?
      • Testing
          • Generated multiple forecasting techniques (examples)
            • Commonly Used
              • Average of last 30 days
              • Average of last 365 days
              • Last year’s count for the same time period
            • Advanced Combinations
              • Different cyclical indices (example: day of year vs. month of year)
              • Different levels of geographic aggregation for indices
              • Different trending functions
          • Scoring methodologies (examples)
            • Mean absolute percent error (with some enhancements)
            • Mean percent error
            • Mean squared error
          • Run thousands of forecasts through testing framework
          • Choose the right technique in the right situation
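The three scoring methods named above have standard textbook forms, sketched below. The "enhancements" to mean absolute percent error mentioned on the slide are not described, so this shows only the plain versions. The sign convention for mean percent error (positive means over-forecasting) is an assumption, and both percent measures are undefined when an actual count is zero.

```python
def mean_absolute_percent_error(actual, forecast):
    """Average size of the error as a percentage of the actual count."""
    return 100 * sum(abs(f - a) / a for a, f in zip(actual, forecast)) / len(actual)


def mean_percent_error(actual, forecast):
    """Average signed error in percent; reveals systematic bias.

    Assumed convention: positive means the forecast was too high.
    """
    return 100 * sum((f - a) / a for a, f in zip(actual, forecast)) / len(actual)


def mean_squared_error(actual, forecast):
    """Average squared error; penalizes large misses more heavily."""
    return sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Made-up example: one forecast 10% too high, one 10% too low.
actual = [100, 200]
forecast = [110, 180]
mape = mean_absolute_percent_error(actual, forecast)  # about 10
mpe = mean_percent_error(actual, forecast)            # about 0: errors cancel
mse = mean_squared_error(actual, forecast)            # 250.0
```

The example shows why more than one score is useful: the signed measure reports no bias even though each forecast missed by 10%.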
    • 37. How Do We Know It’s Accurate?
      • Error for 28-31 day forecasts for any Part X Series
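      • Philadelphia - Citywide: Last 30 Days 6.8%, Last Year 6.5%, Load Forecast 4.1%, Error Reduction 39%
      • Philadelphia - Divisions: Last 30 Days 8.1%, Last Year 8.4%, Load Forecast 5.8%, Error Reduction 28%
      • Philadelphia - Districts: Last 30 Days 10.9%, Last Year 11.7%, Load Forecast 9.3%, Error Reduction 15%
      • Lincoln, NE - Citywide: Last 30 Days 13%, Last Year 11%, Load Forecast 10%, Error Reduction 23%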
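The slide does not say which baseline the Error Reduction figures use. A quick check suggests they are relative to the "Last 30 Days" error; this is an inference from the published numbers, not something the slide states. The small differences are consistent with the slide's percentages having been computed from unrounded errors.

```python
# (last_30_days_error, load_forecast_error, reported_reduction), all in percent,
# copied from the figures on slide 37.
rows = {
    "Philadelphia - Citywide": (6.8, 4.1, 39),
    "Philadelphia - Divisions": (8.1, 5.8, 28),
    "Philadelphia - Districts": (10.9, 9.3, 15),
    "Lincoln, NE - Citywide": (13.0, 10.0, 23),
}


def error_reduction(baseline_error, new_error):
    """Relative reduction in error, in percent of the baseline error."""
    return 100 * (baseline_error - new_error) / baseline_error


# Recomputed reduction per row, rounded to one decimal place for comparison
# against the reported figure.
recomputed = {
    name: round(error_reduction(base, new), 1)
    for name, (base, new, _reported) in rows.items()
}
```

Each recomputed value lands within one percentage point of the reported figure.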
    • 38. Improving CompStat
      • Load forecasting
          • “Given the time of year, day of week, time of day and general trend, what counts of crimes should I expect?”
    • 39. Improving CompStat
    • 40. What’s next?
    • 41. Contact Information
      • Jeremy Heffner, HunchLab Product Manager
      • [email_address]
      • 215.701.7712
      • www.azavea.com/hunchlab
