See conference video - http://www.lucidimagination.com/devzone/events/conferences/revolution/2011
Law enforcement data has many interesting complexities for search. Cross-agency searches are even
more challenging because each agency has its own shorthand. Many different types of similarity
between search clauses and documents should influence the ranking of results. For example, a
search for a “tall suspect” should also return documents describing a “6 foot 4 suspect”.
Spatial clusters are important, as are temporal patterns. Different fields may be more or less
important depending on the type of crime—for example, a victim’s race may matter more than a
vehicle’s make in a sex crime but less in an auto theft. Documents may also be related to each other
in ways that affect their ideal search ranking.
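
One way to picture the per-crime-type field weighting described above is a lookup table that produces the `qf` (query fields) parameter for Solr's extended DisMax query parser. The sketch below is illustrative only: the field names, crime-type keys, boost values, and helper names are hypothetical and are not taken from the presentation.

```python
from urllib.parse import urlencode

# Hypothetical field names and boost weights, chosen only for illustration.
FIELD_BOOSTS = {
    "sex_crime": {"victim_race": 3.0, "suspect_description": 2.0, "vehicle_make": 0.5},
    "auto_theft": {"vehicle_make": 3.0, "suspect_description": 1.0, "victim_race": 0.2},
}


def build_qf(crime_type):
    """Return a qf string of the form 'field^boost field^boost ...'."""
    boosts = FIELD_BOOSTS[crime_type]
    return " ".join(f"{field}^{weight}" for field, weight in boosts.items())


def build_params(query, crime_type):
    """Assemble request parameters for an edismax query with per-type boosts."""
    return {"defType": "edismax", "q": query, "qf": build_qf(crime_type)}


def build_query_string(query, crime_type):
    """URL-encode the parameters so they can be appended to a select URL."""
    return urlencode(build_params(query, crime_type))
```

Calling `build_params("tall suspect", "auto_theft")` would weight the hypothetical `vehicle_make` field above `victim_race`, while the `"sex_crime"` entry reverses that order.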
Solr’s great flexibility in its analyzers, filters, synonyms, and boosting makes it an excellent tool for such
diverse requirements. We’ve contributed a patch to Solr (#SOLR-2058) that helped further improve
search result ranking for cases where a search for a suspect with a “red baseball cap, black leather
jacket” is compared against many documents mentioning red caps, black caps, etc. This presentation
will describe how we addressed some domain-specific challenges of our data.