Search with Polygons: Another Approach to Solr Geospatial Search

Presented by Andrew Urquhart | Raytheon - See conference video - http://www.lucidimagination.com/devzone/events/conferences/lucene-revolution-2012

After investigating the Lucene Spatial Playground approach to Solr geospatial search, Raytheon determined that it was not evolving in a direction that would meet their needs. In particular, they required the ability to search for documents within a geospatial polygon, and they wanted a solution that needs no special handling at any point on the Earth, specifically including the poles and the 180-degree East/West meridian. To meet these requirements, they implemented a Solr/Lucene geospatial search capability that maps latitude/longitude points onto a spherical Earth and then operates in three-dimensional Cartesian space. Using a geohash algorithm modified to produce Long indices, Raytheon indexes the approximate locations of points as numeric values. This approach enables index lookup using Trie structures with numeric range queries. Come hear about their approach to Solr Geospatial Search.



    Presentation Transcript

    • Search with Polygons: Another Approach to Solr Geospatial Search
      Dr. Andrew L. Urquhart
      May 10, 2012
      Copyright © 2012 Raytheon Company. All rights reserved. Customer Success Is Our Mission is a registered trademark of Raytheon Company.
    • What is the “Burning Platform”?
      • Need to break dependency on expensive licenses for proprietary database
        • Major cost driver
        • Unsustainable in current economic environment
      • Solr identified as promising replacement candidate
        • Excellent cost
        • Excellent performance
        • Excellent access to source code
        • Major weakness in required Geospatial Search capability
        • Is Geospatial Search weakness mitigation possible?
          • Must index points for search by polygons
          • Should index polygons for search by polygons
      Solr promising, Polygon Geospatial Search needed
      5/16/12
    • What Has Been Produced?
      • A single add-in JAR file plus Schema enhancements
        • Older variant requires a GPL library for point-in-polygon support
        • Newer variant requires no external libraries
      • Internals use inherent three-dimensional mathematics
        • LUCENE-3795/“Lucene Spatial Playground” geospatial search capability uses JTS library for polygon support
        • JTS uses two-dimensional mathematics
        • JTS has greater vulnerability to special points
          • North and South Poles
          • 180° meridian
          • Potential problem for customer applications
        • JTS supports complex polygons
          • Alternative approach only supports simple polygons at this time
      Single JAR file using 3-D internal mathematics
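To illustrate why three-dimensional mathematics sidesteps the poles and the 180° meridian, here is a minimal Python sketch. It is not the presenter's implementation, which was not public. It assumes a unit-sphere Earth and handles only convex polygons whose vertices are listed counterclockwise as seen from outside the sphere; the function names are hypothetical. Each polygon edge lies on a great circle, and a point is inside when it falls on the inner side of every edge plane.

```python
import math

def to_unit_vector(lat_deg, lon_deg):
    """Map a latitude/longitude pair (degrees) onto the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def in_convex_polygon(point, vertices):
    """True if point (lat, lon) lies inside a convex spherical polygon.

    Assumption: vertices are (lat, lon) pairs in counterclockwise order
    as seen from outside the sphere, and edges are great-circle arcs.
    """
    p = to_unit_vector(*point)
    vs = [to_unit_vector(*v) for v in vertices]
    for a, b in zip(vs, vs[1:] + vs[:1]):
        # cross(a, b) is the normal of the plane through the edge and
        # the Earth's centre; inside points have a non-negative dot.
        if dot(cross(a, b), p) < 0.0:
            return False
    return True

# A cap around the North Pole and a box straddling the 180° meridian.
polar_cap = [(80, 0), (80, 90), (80, 180), (80, -90)]
dateline_box = [(-10, 170), (-10, -170), (10, -170), (10, 170)]
```

Neither test polygon needs special-case code: the pole and the 180° meridian are ordinary points once coordinates are Cartesian vectors. Note that the edges are great-circle arcs, so the edge between two vertices at latitude 80° bows toward the pole.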
    • What is the Magic?
      • Variant geohash coding
        • 64-bit long integers instead of strings
        • Three most significant bits for octants of Earth’s surface
          • Dividing at equator, prime meridian, 90° E/W meridians, 180° meridian
        • Followed by three-bit groups
          • One stop/continue bit
          • One north/south split bit
          • One east/west split bit
        • Allows precision down to 10 cm × 10 cm squares at equator
        • Produces various-size “tiles” representing parts of Earth’s surface
      • Points indexed by the smallest tile which contains the point
      [Figure: bit layout] E/W N/S E/W | C/S N/S E/W | C/S N/S E/W | C/S N/S E/W | C/S N/S E/W
      Indexing using 64-bit integers for trie-driven search
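The coding on this slide can be sketched as follows. This is one possible reading of the bit layout, not the presenter's code: the bit polarities, the left alignment of the code, the zero padding after the stop bit, and the names `encode_tile`, `descendant_range`, and `MAX_LEVELS` are all assumptions. The sketch keeps the top bit clear so values stay non-negative, which limits it to 20 three-bit groups, and it makes no attempt to reproduce the precision quoted on the slide.

```python
MAX_LEVELS = 20  # assumption: 3 octant bits + 20 groups of 3 bits = 63 bits

def encode_tile(lat, lon, levels):
    """Encode the tile containing (lat, lon) after `levels` subdivisions."""
    if not 0 <= levels <= MAX_LEVELS:
        raise ValueError("levels out of range")
    # Octant bits in the order shown in the figure: E/W, N/S, E/W.
    east = lon >= 0.0
    north = lat >= 0.0
    lon_lo, lon_hi = (0.0, 180.0) if east else (-180.0, 0.0)
    lat_lo, lat_hi = (0.0, 90.0) if north else (-90.0, 0.0)
    lon_mid = (lon_lo + lon_hi) / 2
    east_half = lon >= lon_mid
    lon_lo, lon_hi = (lon_mid, lon_hi) if east_half else (lon_lo, lon_mid)
    code = (int(east) << 2) | (int(north) << 1) | int(east_half)
    for _ in range(levels):
        lat_mid = (lat_lo + lat_hi) / 2
        lon_mid = (lon_lo + lon_hi) / 2
        n, e = lat >= lat_mid, lon >= lon_mid
        # One group: continue bit set, then N/S bit, then E/W bit.
        code = (code << 3) | 0b100 | (int(n) << 1) | int(e)
        lat_lo, lat_hi = (lat_mid, lat_hi) if n else (lat_lo, lat_mid)
        lon_lo, lon_hi = (lon_mid, lon_hi) if e else (lon_lo, lon_mid)
    # Left-align; the zero padding acts as a "stop" group.
    return code << (3 * (MAX_LEVELS - levels))

def descendant_range(code, levels):
    """Inclusive numeric range holding the tile and all of its sub-tiles."""
    span = (1 << (3 * (MAX_LEVELS - levels))) - 1
    return code, code | span
```

Under these assumptions a tile and everything nested inside it occupy one contiguous block of integers, which is the property that lets a numeric range query stand in for a prefix match on a string geohash.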
    • What About Polygons?
      • Polygons indexed as collection of tiles inside polygon
        • Larger tiles completely contained in indexed polygon are not subdivided
        • Smallest indexed tiles may extend outside indexed polygon
      Polygons indexed with series of hash codes
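The two tiling rules on this slide can be sketched with a recursive cover. To keep the example short and exact, the indexed shape here is a latitude/longitude rectangle standing in for a real polygon; that substitution, the tile representation as latitude/longitude bounds, and the name `cover` are assumptions made for illustration only.

```python
def cover(region, max_level):
    """Cover region = (lat_lo, lat_hi, lon_lo, lon_hi) with quadtree tiles.

    Returns tuples (lat_lo, lat_hi, lon_lo, lon_hi, level).
    """
    r_lat_lo, r_lat_hi, r_lon_lo, r_lon_hi = region
    tiles = []

    def visit(lat_lo, lat_hi, lon_lo, lon_hi, level):
        # Skip tiles that do not overlap the region at all.
        if (lat_hi <= r_lat_lo or lat_lo >= r_lat_hi or
                lon_hi <= r_lon_lo or lon_lo >= r_lon_hi):
            return
        inside = (r_lat_lo <= lat_lo and lat_hi <= r_lat_hi and
                  r_lon_lo <= lon_lo and lon_hi <= r_lon_hi)
        # Fully contained tiles are kept whole; boundary tiles are kept
        # once the finest level is reached, even if they stick out.
        if inside or level == max_level:
            tiles.append((lat_lo, lat_hi, lon_lo, lon_hi, level))
            return
        lat_mid = (lat_lo + lat_hi) / 2
        lon_mid = (lon_lo + lon_hi) / 2
        for a, b in ((lat_lo, lat_mid), (lat_mid, lat_hi)):
            for c, d in ((lon_lo, lon_mid), (lon_mid, lon_hi)):
                visit(a, b, c, d, level + 1)

    # Start from the eight octants of the Earth's surface.
    for lat_lo in (-90.0, 0.0):
        for lon_lo in (-180.0, -90.0, 0.0, 90.0):
            visit(lat_lo, lat_lo + 90.0, lon_lo, lon_lo + 90.0, 0)
    return tiles
```

For example, `cover((0, 45, 0, 90), 3)` stops at two level-1 tiles because both are fully contained, while `cover((0, 40, 0, 90), 2)` returns some level-2 tiles that reach up to latitude 45 and therefore extend outside the region.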
    • What About Polygon Search?
      • Search polygon converted to tiles using indexing conversion process
        • Possible to get too many tile indices to search
          • Risks Lucene complaints about too many BooleanClauses
          • Consolidate adjacent indices into ranges
          • Reduce tiling precision
            • Reduce number of ranges
            • Produce acceptable number of BooleanClauses
      • Results filtered by original search polygon
        • Requires storage of original geometry data in addition to index
        • No filter query required
          • Index always accessed with NumericRangeQuery
          • Insert custom logic wrapping NumericRangeQuery
      Search similar to indexing with additional filtering
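The range consolidation step can be sketched in a few lines. The first function merges adjacent or overlapping inclusive integer ranges, as the slide describes. The second is a stand-in, not the slide's method: the slide reduces tiling precision to get fewer ranges, whereas this sketch simply closes the smallest gaps until a cap is met, which over-selects and relies on the later filtering by the original polygon. Both function names are hypothetical. For context, Lucene's BooleanQuery rejects queries above a maximum clause count, 1024 by default.

```python
def consolidate(ranges):
    """Merge overlapping or adjacent inclusive integer ranges."""
    merged = []
    for lo, hi in sorted(ranges):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return [tuple(r) for r in merged]

def cap_ranges(ranges, max_ranges):
    """Close the smallest gaps until at most max_ranges ranges remain.

    Assumption: extra matches are acceptable because results are later
    filtered against the original search geometry.
    """
    if max_ranges < 1:
        raise ValueError("max_ranges must be at least 1")
    ranges = consolidate(ranges)
    while len(ranges) > max_ranges:
        # Index of the pair of neighbours separated by the smallest gap.
        i = min(range(len(ranges) - 1),
                key=lambda k: ranges[k + 1][0] - ranges[k][1])
        ranges[i:i + 2] = [(ranges[i][0], ranges[i + 1][1])]
    return ranges
```

Each resulting `(lo, hi)` pair would correspond to one numeric range clause in the final query.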
    • How Is This Capability Used?
      • Indexing accessed using custom FieldTypes in schema
        • Specific types for each supported geometry type
        • A general type to allow polymorphic geometry types
          • Trade-off is greater application coupling
        • Specific type classes transform inputs and hand-off to general type class
        • Indexing writes out two fields
          • Geospatial tile index
          • Original geometry storage
      • Search accessed using custom QParserPlugin
        • Detects special suffixes on search field name to determine geometry type
        • Converts input to geospatial tile index collection
        • Builds Lucene query structure including custom and standard classes
      New schema FieldTypes and new QParserPlugin
    • What Geometries Are Supported?
      • Points
        • Specified by latitude and longitude
      • Polygons
        • Specified by latitude-longitude pairs
      • Latitude-Longitude Boxes
        • Specified by two latitude-longitude pairs specifying opposite corners
        • Internally converted to polygons
      • Point-Radii
        • Specified by latitude and longitude of center plus radius in meters, kilometers, statute miles, or nautical miles
        • Assumes spherical Earth NOT WGS-84 ellipsoid
          • Errors accepted for search
        • Internally converted to approximating polygons
      Latitude-Longitude Boxes and Point-Radii supported
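A point-radius shape can be turned into an approximating polygon on a spherical Earth along the following lines. This is a sketch under stated assumptions, not the presenter's conversion: the mean Earth radius value, the default vertex count, the unit keys, and the name `circle_to_polygon` are all chosen here for illustration. The construction works in 3-D, so a circle centred on a pole needs no special case. The vertices lie on the circle, so the polygon is slightly smaller than the true circle.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumption: mean radius of a spherical Earth
UNIT_TO_M = {"m": 1.0, "km": 1000.0, "mi": 1609.344, "nmi": 1852.0}

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def circle_to_polygon(lat, lon, radius, unit="km", n=32):
    """Return n (lat, lon) vertices approximating a circle on a sphere."""
    d = radius * UNIT_TO_M[unit] / EARTH_RADIUS_M  # angular radius, radians
    la, lo = math.radians(lat), math.radians(lon)
    c = (math.cos(la) * math.cos(lo), math.cos(la) * math.sin(lo), math.sin(la))
    # Build two unit vectors u, v perpendicular to the centre vector c,
    # starting from whichever axis is not nearly parallel to c.
    helper = (0.0, 0.0, 1.0) if abs(c[2]) < 0.9 else (1.0, 0.0, 0.0)
    u = _cross(helper, c)
    norm = math.sqrt(sum(x * x for x in u))
    u = tuple(x / norm for x in u)
    v = _cross(c, u)
    vertices = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        p = [c[i] * math.cos(d)
             + (u[i] * math.cos(t) + v[i] * math.sin(t)) * math.sin(d)
             for i in range(3)]
        z = max(-1.0, min(1.0, p[2]))  # guard asin against rounding
        vertices.append((math.degrees(math.asin(z)),
                         math.degrees(math.atan2(p[1], p[0]))))
    return vertices
```

A circle centred near longitude 180° simply yields vertices with both positive and negative longitudes; any wrap-around handling happens only when converting back to latitude and longitude.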
    • How Can the Public Get This?
      • Currently working Intellectual Property issues
        • Employer required provisional patent application submission before Lucene Revolution abstract could be submitted
          • Could protect public use of license assuming public release
        • Customer has Unrestricted Rights
          • Customer can release to public open source community
          • Customer may release to public open source community
        • Customer dislikes proprietary solutions
      • Also need to work packaging issues such as a name
      Not yet available to public, but that may change
    • Summary
      • Solr is excellent choice for our replacement of expensive database
      • Geospatial Search with Polygons in Solr is possible and implemented
        • Can be used with or without LUCENE-3795/“Lucene Spatial Playground” approach
        • Inherent 3-dimensional mathematics not found in LUCENE-3795 polygon support
        • Stores and uses both indices and original geometries
        • No support for complex polygons at this time
      • Capabilities accessed with new FieldTypes and a new QParserPlugin
      • Not yet released to public