In essence, HunchLab is a new way of analyzing events in space and time. The software uses statistical algorithms to detect increases or decreases in point density, which is hard for a person to judge reliably by eye.
HunchLab currently works with crime data. In a process known as CompStat, police officers typically meet monthly to discuss trends and analyze data. The idea behind HunchLab is to turn this into an interactive, continuous process rather than a slow, large meeting.
In CompStat, you might look at something like this. This traditional “pin map” of crime is somewhat helpful, but leaves a lot of questions unanswered. It is more or less raw data; without better tools, only basic things can be inferred from it.
The problem becomes worse with real data. We are rolling out HunchLab to the Philadelphia Police Department. They have over 15 million incidents in their database, and the number is growing.
In looking at this data, we want the “actionable intelligence”. We want to be able to allocate personnel to places that don’t have enough, see where unusual things are happening, and have enough information to act on it. Pin maps don’t do that.
We might try this – the heat map. Red means more crime. People like heat maps. But with crime you’ll see hot spots in bad neighborhoods. Everyone knows where these are. Therefore, this is not particularly useful to police personnel.
This is the same spatial area. What has changed in the last week? It’s hard to tell if something unusual is going on. But if there is, we can change in response to it – send more people, ask questions, etc.
First we need to focus on a particular area or areas of interest. Once that’s done, we have a working set of data. We can use statistics to analyze the situation. The way this happens is not obvious. We call crimes balls in a bag…
If we think of crimes as balls in a bag and split them into “Black Balls” (those in our current spatial extent) and “White Balls” (those not in our spatial extent), we can apply something called the Hypergeometric Distribution to them and get probabilities!
Time here is the Z-Axis. This diagram shows how a circular region drills through the space-time in question. The number of crimes in each colored section is totaled to give parameters to our Hypergeometric Equation.
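To make the idea concrete, here is a minimal sketch of how counts from the space-time sections could feed a hypergeometric test. The mapping of counts to parameters is an assumption made for illustration, not a description of HunchLab's internals: the bag holds all crimes in the study area, the black balls are the crimes inside the circle, and the "draw" is the set of crimes in the recent period. The function names and all numbers are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(exactly k black balls) when drawing n balls without replacement
    from a bag of N balls, K of which are black."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def upper_tail(k, N, K, n):
    """P(k or more black balls): how surprising is a count this high?"""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))


# Hypothetical counts, invented for illustration only.
total_crimes = 1000    # N: all crimes in the study area over the whole history
in_circle = 50         # K: crimes inside the circle over the whole history
recent_crimes = 100    # n: crimes anywhere in the study area in the recent period
recent_in_circle = 12  # k: recent crimes that fall inside the circle

p_value = upper_tail(recent_in_circle, total_crimes, in_circle, recent_crimes)
print(f"expected in circle: {recent_crimes * in_circle / total_crimes:.1f}")
print(f"P(count >= {recent_in_circle}) = {p_value:.4f}")
```

A small tail probability says that seeing this many recent crimes inside the circle would be unlikely if the recent period looked like the historical norm. Where to set the cutoff for "significant" is a separate choice that this sketch does not make.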
We can apply the algorithm across many spatial areas – user defined, for example, or shotgun-like, blanketing an area with smaller areas of interest. Either way, the algorithm identifies when density increases are statistically significant!
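The "shotgun" approach can be sketched as a regular grid of circle centers laid over the region, with events counted inside each circle. This is only an illustration under assumed choices (a square grid, planar coordinates, a fixed radius); the helper names and sample points are hypothetical and not taken from HunchLab.

```python
from math import hypot


def grid_centers(xmin, ymin, xmax, ymax, spacing):
    """Return circle centers on a regular grid covering the bounding box."""
    nx = int((xmax - xmin) // spacing) + 1
    ny = int((ymax - ymin) // spacing) + 1
    return [(xmin + i * spacing, ymin + j * spacing)
            for j in range(ny) for i in range(nx)]


def count_in_circle(points, cx, cy, radius):
    """Count the points that lie within `radius` of the center (cx, cy)."""
    return sum(1 for x, y in points if hypot(x - cx, y - cy) <= radius)


# Hypothetical event locations in arbitrary planar units.
events = [(0.2, 0.1), (0.9, 1.1), (1.0, 1.0), (1.8, 1.9), (1.1, 0.9)]
radius = 0.5

counts = {
    center: count_in_circle(events, center[0], center[1], radius)
    for center in grid_centers(0, 0, 2, 2, 1)
}
for center, n in counts.items():
    print(center, n)
```

Each circle's count would then be compared against its own history, so a single pass can flag many small areas at once.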
This is fine, from a theoretical perspective. But for this to be useful, we need to be able to harness this algorithm with a tool that allows us to apply it systematically and see results. So we developed HunchLab to do that.
A police captain or detective might typically start using HunchLab after receiving a notification from the system like this. The user would have to either explicitly subscribe to this hunch, or be part of a group which gets this sort of hunch.
Our investigator can see that the hunch has been valid for the last three days. All pertinent information about the hunch is displayed in the panel. It seems thefts are on the increase in the area!
The graph displays counts of events in the area over time. In this case, it is easy to see why this hunch is valid – no other period has as much crime. *This* is actionable intelligence – what is going on here?
Our investigator chooses to look deeper; when a bar on the graph is clicked, the events in that bar are shown on the map. The system provides useful knowledge that can help explain why crime has increased here.
Our investigator sets up another hunch nearby, filling out relevant information easily. Instantly, the results can be viewed – and the hunch will remain in the system and be evaluated nightly for validity.
The hunch we are investigating, though, was not created explicitly by a person. Rather, it came from a Mass Hunch – the process of blanketing an area with hunches. However, creating a mass hunch is much like creating a regular hunch.
Our investigator now looks in the system for other hunches near the area of interest. Unless a hunch has been marked “private”, the investigator can view and subscribe to it as if it were his or her own. This allows HunchLab to be used as a collaborative tool.
Crime is really only the beginning. HunchLab sees any source of events over time as identical – it doesn’t care if it’s crime, disease, sales, whatever. Being able to use the same software package to analyze these very different types of data is really exciting!
What HunchLab Is… IS a new way to: • Look at crime density changes • Get notified about them • Get more information NOT a way to understand Igor Slide 1
A Visualization Solution Becomes a Problem • 1.5 million incidents per year • Around 40 crime classifications • Approximately 6900 sworn officers Slide 4
A Classic Problem – The “Data Tornado” By Golly Toto, it’s a twister! Slide 5
Where Do They Clump? From http://data.baltimoresun.com/crime/anne_arundel/accident-person-injured/?mode=heat (not HunchLab) Slide 6
What We Really Want to Know – How are the Densities CHANGING? Crimes 8/16/2008 – 8/22/2008 Crimes 8/23/2008 – 8/30/2008 Slide 7
How Do We Figure out Density Change? • Take an area or *many* areas to evaluate • Use statistics to compare historical norms to current densities • But… how? • Call crimes balls in a bag. Slide 8
That’s Right, Balls in a Bag A bag has 20 black balls and 80 white balls. Choose 5 at random. What is the chance of getting 3 black balls? Vary number of balls chosen and you get a Hypergeometric Distribution Slide 9
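The question on this slide can be worked out directly by counting. The short sketch below assumes "3 black balls" means exactly 3; the variable names are just for illustration.

```python
from math import comb

black, white = 20, 80  # contents of the bag
drawn, wanted = 5, 3   # balls chosen, black balls we hope to see

# Ways to pick exactly 3 of the 20 black balls and 2 of the 80 white balls,
# divided by all the ways to pick any 5 of the 100 balls.
ways_wanted = comb(black, wanted) * comb(white, drawn - wanted)
ways_total = comb(black + white, drawn)

probability = ways_wanted / ways_total
print(f"{probability:.4f}")  # prints 0.0478
```

So the chance is a little under 5%: 1140 × 3160 favorable draws out of 75,287,520 possible ones.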