Search with Polygons: Another Approach to Solr Geospatial Search

Presented by Andrew Urquhart | Raytheon - See conference video - http://www.lucidimagination.com/devzone/events/conferences/lucene-revolution-2012

After investigating the Lucene Spatial Playground approach to Solr geospatial search, Raytheon determined that it was not evolving in a direction that would meet their needs. In particular, they required the ability to search for documents within a geospatial polygon, and they also wanted a solution that would not require special handling at any point on the Earth, specifically including the poles and the 180-degree East/West longitude meridian. Given these requirements, they implemented a Solr/Lucene geospatial search capability that maps latitude/longitude points onto a spherical Earth and then operates in three-dimensional Cartesian space. Using the geohash algorithm modified to produce Long indices, Raytheon indexes the approximate locations of points as numeric values. This approach enables index lookup using Trie structures with numeric range queries. Come hear about their approach to Solr Geospatial Search.
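The mapping from latitude/longitude onto a spherical Earth in Cartesian space can be illustrated with a minimal Java sketch. It is not the presenter's code: the class name `SphereVector` and both method names are hypothetical, and a unit sphere is assumed. It shows why the poles and the 180-degree meridian need no special handling once points are unit vectors.

```java
// Illustrative sketch only; names are hypothetical, not from the presentation.
final class SphereVector {

    /** Converts latitude/longitude in degrees to a unit vector {x, y, z}. */
    static double[] toUnitVector(double latDeg, double lonDeg) {
        double lat = Math.toRadians(latDeg);
        double lon = Math.toRadians(lonDeg);
        double cosLat = Math.cos(lat);
        return new double[] { cosLat * Math.cos(lon), cosLat * Math.sin(lon), Math.sin(lat) };
    }

    /** Central angle in radians between two unit vectors, via atan2(|a x b|, a . b). */
    static double centralAngle(double[] a, double[] b) {
        double cx = a[1] * b[2] - a[2] * b[1];
        double cy = a[2] * b[0] - a[0] * b[2];
        double cz = a[0] * b[1] - a[1] * b[0];
        double cross = Math.sqrt(cx * cx + cy * cy + cz * cz);
        double dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
        return Math.atan2(cross, dot);
    }

    public static void main(String[] args) {
        // Two points on either side of the 180-degree meridian.
        double[] west = toUnitVector(0.0, 179.5);
        double[] east = toUnitVector(0.0, -179.5);
        System.out.println(Math.toDegrees(centralAngle(west, east)));
    }
}
```

In longitude arithmetic the two sample points appear 359 degrees apart; as vectors they are simply neighbours, with no wrap-around case to code for.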

Published in: Technology

  1. Search with Polygons: Another Approach to Solr Geospatial Search
     Dr. Andrew L. Urquhart, May 10, 2012
     Copyright © 2012 Raytheon Company. All rights reserved. Customer Success Is Our Mission is a registered trademark of Raytheon Company.
  2. What is the “Burning Platform”?
     §  Need to break dependency on expensive licenses for proprietary database
        –  Major cost driver
        –  Unsustainable in current economic environment
     §  Solr identified as promising replacement candidate
        –  Excellent cost
        –  Excellent performance
        –  Excellent access to source code
        –  Major weakness in required Geospatial Search capability
        –  Is Geospatial Search weakness mitigation possible?
           §  Must index points for search by polygons
           §  Should index polygons for search by polygons
     Solr promising, Polygon Geospatial Search needed
     (Slide footer date: 5/16/12)
  3. What Has Been Produced?
     §  A single add-in JAR file plus Schema enhancements
        –  Older variant requires a GPL library for point-in-polygon support
        –  Newer variant requires no external libraries
     §  Internals use inherent three-dimensional mathematics
        –  LUCENE-3795/“Lucene Spatial Playground” geospatial search capability uses JTS library for polygon support
        –  JTS uses two-dimensional mathematics
        –  JTS has greater vulnerability to special points
           §  North and South Poles
           §  180° meridian
           §  Potential problem for customer applications
        –  JTS supports complex polygons
           §  Alternative approach only supports simple polygons at this time
     Single JAR file using 3-D internal mathematics
  4. What is the Magic?
     §  Variant geohash coding
        –  64-bit long integers instead of strings
        –  Three most significant bits for octants of Earth’s surface
           §  Dividing at equator, prime meridian, 90° E/W meridians, 180° meridian
        –  Followed by three-bit groups
           §  One stop/continue bit
           §  One north/south split bit
           §  One east/west split bit
        –  Allows precision down to 10 cm × 10 cm squares at equator
        –  Produces various-size “tiles” representing parts of Earth’s surface
     §  Points indexed by the smallest tile which contains the point
     [Figure: bit layout of the index. Three octant bits (E/W, N/S, E/W) followed by repeating three-bit groups (C/S, N/S, E/W).]
     Indexing using 64-bit integers for trie-driven search
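One possible reading of the bit layout above is sketched below in Java. Everything beyond the slide text is an assumption: the class name `TileCode`, the unused sign bit, the choice of 1 for "continue", the meaning of each octant bit, the limit of 20 three-bit groups, and the plain latitude/longitude bisection (the presentation works in three dimensions). The sketch does not attempt to reproduce the 10 cm precision quoted on the slide.

```java
// Illustrative sketch only; the bit conventions here are assumptions, not the presenter's encoding.
final class TileCode {
    // Assumed budget: 1 unused sign bit + 3 octant bits + 20 groups of 3 bits = 64 bits.
    static final int MAX_LEVELS = 20;

    /** Encodes a point as the code of the tile containing it after `levels` subdivisions. */
    static long encode(double latDeg, double lonDeg, int levels) {
        if (levels < 0 || levels > MAX_LEVELS) {
            throw new IllegalArgumentException("levels out of range");
        }
        boolean north = latDeg >= 0.0;                       // split at the equator
        boolean east = lonDeg >= 0.0;                        // split at prime/180-degree meridian
        boolean far = east ? lonDeg >= 90.0 : lonDeg < -90.0; // split at the 90-degree meridians
        int shift = 62;                                      // bit 63 (sign) stays zero
        long code = 0L;
        code |= (north ? 1L : 0L) << shift--;
        code |= (east ? 1L : 0L) << shift--;
        code |= (far ? 1L : 0L) << shift--;
        double latLo = north ? 0.0 : -90.0;
        double latHi = latLo + 90.0;
        double lonLo = east ? (far ? 90.0 : 0.0) : (far ? -180.0 : -90.0);
        double lonHi = lonLo + 90.0;
        for (int i = 0; i < levels; i++) {
            double latMid = (latLo + latHi) / 2.0;
            double lonMid = (lonLo + lonHi) / 2.0;
            boolean upper = latDeg >= latMid;
            boolean right = lonDeg >= lonMid;
            code |= 1L << shift--;                           // continue bit (assumed: 1 = continue)
            code |= (upper ? 1L : 0L) << shift--;            // north/south split bit
            code |= (right ? 1L : 0L) << shift--;            // east/west split bit
            if (upper) { latLo = latMid; } else { latHi = latMid; }
            if (right) { lonLo = lonMid; } else { lonHi = lonMid; }
        }
        return code; // remaining groups stay zero, i.e. "stop"
    }

    /** Largest code belonging to the tile `code` (at `levels`) or to any tile nested inside it. */
    static long rangeEnd(long code, int levels) {
        int freeBits = 3 * (MAX_LEVELS - levels);
        return code | ((1L << freeBits) - 1L);
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(encode(10.0, 20.0, 1)));
    }
}
```

Under these assumptions a tile and everything nested inside it occupy one contiguous interval `[code, rangeEnd(code, levels)]`, which is the property that makes numeric range queries over a trie-encoded field attractive. A longitude of exactly 180° and one of exactly -180° land in different octants here, so a real implementation would normalize one to the other first.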
  5. What About Polygons?
     §  Polygons indexed as collection of tiles inside polygon
        –  Larger tiles completely contained in indexed polygon are not subdivided
        –  Smallest indexed tiles may extend outside indexed polygon
     Polygons indexed with series of hash codes
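The two rules on this slide (contained tiles are kept whole, and tiles at the finest level may overhang the polygon) can be sketched as a recursive subdivision in Java. This is an illustration under stated assumptions, not the presenter's algorithm: `TileCover`, `Relation`, `Classifier`, and `box` are hypothetical names, tiles are plain latitude/longitude rectangles `{latLo, latHi, lonLo, lonHi}`, and the demo "polygon" is an axis-aligned box so that the tile-versus-polygon test stays trivial.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and the rectangle-based tiles are assumptions.
final class TileCover {
    enum Relation { INSIDE, OUTSIDE, PARTIAL }

    /** Hypothetical callback: how a tile {latLo, latHi, lonLo, lonHi} relates to the polygon. */
    interface Classifier {
        Relation classify(double[] tile);
    }

    /** Returns the tiles that cover the polygon, subdividing at most maxDepth times. */
    static List<double[]> cover(double[] root, int maxDepth, Classifier polygon) {
        List<double[]> out = new ArrayList<>();
        collect(root, maxDepth, polygon, out);
        return out;
    }

    private static void collect(double[] t, int depthLeft, Classifier polygon, List<double[]> out) {
        Relation r = polygon.classify(t);
        if (r == Relation.OUTSIDE) {
            return;                       // tile contributes nothing
        }
        if (r == Relation.INSIDE || depthLeft == 0) {
            out.add(t);                   // contained tiles are not subdivided; at the finest
            return;                       // level a partial tile is kept and may overhang
        }
        double latMid = (t[0] + t[1]) / 2.0;
        double lonMid = (t[2] + t[3]) / 2.0;
        collect(new double[] { t[0], latMid, t[2], lonMid }, depthLeft - 1, polygon, out);
        collect(new double[] { t[0], latMid, lonMid, t[3] }, depthLeft - 1, polygon, out);
        collect(new double[] { latMid, t[1], t[2], lonMid }, depthLeft - 1, polygon, out);
        collect(new double[] { latMid, t[1], lonMid, t[3] }, depthLeft - 1, polygon, out);
    }

    /** Demo classifier for an axis-aligned box {latLo, latHi, lonLo, lonHi}. */
    static Classifier box(double[] b) {
        return t -> {
            if (t[1] <= b[0] || t[0] >= b[1] || t[3] <= b[2] || t[2] >= b[3]) {
                return Relation.OUTSIDE;
            }
            if (t[0] >= b[0] && t[1] <= b[1] && t[2] >= b[2] && t[3] <= b[3]) {
                return Relation.INSIDE;
            }
            return Relation.PARTIAL;
        };
    }

    public static void main(String[] args) {
        List<double[]> tiles = cover(new double[] { 0, 90, 0, 90 }, 3, box(new double[] { 0, 30, 0, 90 }));
        System.out.println(tiles.size() + " tiles");
    }
}
```

A real polygon would replace `box` with a point-in-polygon style test, which is where the presentation's three-dimensional mathematics would come in.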
  6. What About Polygon Search?
     §  Search polygon converted to tiles using indexing conversion process
        –  Possible to get too many tile indices to search
           §  Risks Lucene complaints about too many BooleanClauses
           §  Consolidate adjacent indices into ranges
           §  Reduce tiling precision
              –  Reduce number of ranges
              –  Produce acceptable number of BooleanClauses
     §  Results filtered by original search polygon
        –  Requires storage of original geometry data in addition to index
        –  No filter query required
           §  Index always accessed with NumericRangeQuery
           §  Insert custom logic wrapping NumericRangeQuery
     Search similar to indexing with additional filtering
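The "consolidate adjacent indices into ranges" step can be sketched in Java as an interval merge over inclusive `[start, end]` pairs of long codes. The class name `RangeConsolidator` and the pair representation are assumptions for illustration; the slide does not describe how the real implementation stores its ranges.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; the representation of ranges is an assumption.
final class RangeConsolidator {

    /** Merges overlapping or directly adjacent inclusive ranges {start, end}. */
    static List<long[]> consolidate(List<long[]> ranges) {
        List<long[]> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingLong(r -> r[0]));
        List<long[]> merged = new ArrayList<>();
        for (long[] r : sorted) {
            long[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            // The MAX_VALUE check avoids overflow in last[1] + 1.
            if (last != null && (last[1] == Long.MAX_VALUE || r[0] <= last[1] + 1)) {
                last[1] = Math.max(last[1], r[1]);        // extend the current range
            } else {
                merged.add(new long[] { r[0], r[1] });    // start a new range (copy the input)
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<long[]> in = new ArrayList<>();
        in.add(new long[] { 5, 7 });
        in.add(new long[] { 1, 2 });
        in.add(new long[] { 3, 4 });
        in.add(new long[] { 10, 12 });
        for (long[] r : consolidate(in)) {
            System.out.println(r[0] + ".." + r[1]);
        }
    }
}
```

Each surviving range would become one numeric range clause. If the count is still too high for Lucene's clause limit, the slide's other remedy applies: re-tile the search polygon at a coarser precision and consolidate again.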
  7. How Is This Capability Used?
     §  Indexing accessed using custom FieldTypes in schema
        –  Specific types for each supported geometry type
        –  A general type to allow polymorphic geometry types
           §  Trade-off is greater application coupling
        –  Specific type classes transform inputs and hand-off to general type class
        –  Indexing writes out two fields
           §  Geospatial tile index
           §  Original geometry storage
     §  Search accessed using custom QParserPlugin
        –  Detects special suffixes on search field name to determine geometry type
        –  Converts input to geospatial tile index collection
        –  Builds Lucene query structure including custom and standard classes
     New schema FieldTypes and new QParserPlugin
  8. What Geometries Are Supported?
     §  Points
        –  Specified by latitude and longitude
     §  Polygons
        –  Specified by latitude-longitude pairs
     §  Latitude-Longitude Boxes
        –  Specified by two latitude-longitude pairs specifying opposite corners
        –  Internally converted to polygons
     §  Point-Radii
        –  Specified by latitude and longitude of center plus radius in meters, kilometers, statute miles, or nautical miles
        –  Assumes spherical Earth NOT WGS-84 ellipsoid
           §  Errors accepted for search
        –  Internally converted to approximating polygons
     Latitude-Longitude Boxes and Point-Radii supported
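Converting a point-radius into an approximating polygon on a spherical Earth can be sketched in Java with the standard great-circle destination formula. The class name `CirclePolygon`, the mean Earth radius value, and the evenly spaced vertices are assumptions; the slide does not say which radius or vertex count the real implementation uses.

```java
// Illustrative sketch only; the Earth radius and vertex placement are assumptions.
final class CirclePolygon {
    static final double EARTH_RADIUS_M = 6_371_008.8;      // assumed mean radius of a spherical Earth
    static final double METERS_PER_KILOMETER = 1000.0;
    static final double METERS_PER_STATUTE_MILE = 1609.344;
    static final double METERS_PER_NAUTICAL_MILE = 1852.0;

    /** Returns `vertices` points {latDeg, lonDeg} lying on the circle around the center. */
    static double[][] approximate(double latDeg, double lonDeg, double radiusMeters, int vertices) {
        if (vertices < 3) {
            throw new IllegalArgumentException("need at least 3 vertices");
        }
        double lat1 = Math.toRadians(latDeg);
        double lon1 = Math.toRadians(lonDeg);
        double d = radiusMeters / EARTH_RADIUS_M;           // angular distance in radians
        double[][] ring = new double[vertices][];
        for (int i = 0; i < vertices; i++) {
            double bearing = 2.0 * Math.PI * i / vertices;  // clockwise from north
            double sinLat2 = Math.sin(lat1) * Math.cos(d)
                    + Math.cos(lat1) * Math.sin(d) * Math.cos(bearing);
            double lat2 = Math.asin(Math.max(-1.0, Math.min(1.0, sinLat2)));
            double lon2 = lon1 + Math.atan2(Math.sin(bearing) * Math.sin(d) * Math.cos(lat1),
                    Math.cos(d) - Math.sin(lat1) * sinLat2);
            // Wrap longitude into [-180, 180) so circles crossing the 180-degree meridian stay valid.
            double lonDeg2 = ((Math.toDegrees(lon2) + 540.0) % 360.0) - 180.0;
            ring[i] = new double[] { Math.toDegrees(lat2), lonDeg2 };
        }
        return ring;
    }

    public static void main(String[] args) {
        double radius = 50.0 * METERS_PER_NAUTICAL_MILE;
        for (double[] p : approximate(0.0, 179.9, radius, 8)) {
            System.out.println(p[0] + ", " + p[1]);
        }
    }
}
```

Because the vertices sit on the circle, the polygon is inscribed and slightly under-covers it. Scaling the radius up or adding vertices trades accuracy against the number of tiles produced, in the same spirit as the "errors accepted for search" note above.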
  9. How Can the Public Get This?
     §  Currently working Intellectual Property issues
        –  Employer required provisional patent application submission before Lucene Revolution abstract could be submitted
           §  Could protect public use of license assuming public release
        –  Customer has Unrestricted Rights
           §  Customer can release to public open source community
           §  Customer may release to public open source community
        –  Customer dislikes proprietary solutions
     §  Also need to work packaging issues such as a name
     Not yet available to public, but that may change
  10. Summary
     §  Solr is excellent choice for our replacement of expensive database
     §  Geospatial Search with Polygons in Solr is possible and implemented
        –  Can be used with or without LUCENE-3795/“Lucene Spatial Playground” approach
        –  Inherent 3-dimensional mathematics not found in LUCENE-3795 polygon support
        –  Stores and uses both indices and original geometries
        –  No support for complex polygons at this time
     §  Capabilities accessed with new FieldTypes and a new QParserPlugin
     §  Not yet released to public
