In version 2.4, there have been significant enhancements to the geospatial indexing capabilities in MongoDB, such as polygon intersections, a more accurate spherical model, and better integration with MongoDB's aggregation framework. In this presentation, you'll learn about the new enhancements and how they are enabling developers to more quickly and easily develop spatially-aware applications.
4. MongoDB has had geo for a while
• `2d` index
– Store points on 2d plane
– Search for points within a:
• Rectangle ($box)
• Polygon ($polygon)
• Circle ($center)
• Circle on a sphere ($centerSphere)
– Search for nearest points ($near, $nearSphere)
2.4 Geospatial features – Ian Bentley
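The `2d` searches listed on this slide can be written as plain query documents. The following is a minimal sketch in Python, assuming a pymongo collection whose `loc` field holds legacy [x, y] pairs under a `2d` index. The field name and the helper function names are illustrative assumptions, not part of any driver. The sketch uses `$geoWithin`, the 2.4 name of the operator; earlier releases spell it `$within`.

```python
def within_box(field, bottom_left, top_right):
    """Points inside an axis-aligned rectangle on the plane."""
    return {field: {"$geoWithin": {"$box": [bottom_left, top_right]}}}


def within_circle(field, center, radius):
    """Points inside a planar circle; radius is in coordinate units."""
    return {field: {"$geoWithin": {"$center": [center, radius]}}}


def within_circle_on_sphere(field, center, radius_radians):
    """Points inside a circle on a sphere; radius is in radians."""
    return {field: {"$geoWithin": {"$centerSphere": [center, radius_radians]}}}


def nearest(field, point):
    """Points ordered by planar distance from a legacy [x, y] point."""
    return {field: {"$near": point}}


box_query = within_box("loc", [0, 0], [10, 10])
circle_query = within_circle("loc", [5, 5], 2)
```

Each returned dictionary can be passed directly to `collection.find(...)`.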
5. Some desirable things!
• Storing non-point geometries
• Within searches on a sphere
• Searching for intersecting geometries on a sphere
• Better support for compound indexes
6. Storing non-point geometries
• GeoJSON – A collaborative community project
that produced a specification for encoding
geometric entities in JSON
• Gaining wide support
– OpenLayers
– PostGIS
– Libraries in several languages
7. GeoJSON allows us to encode
Points:
{
  geo: {
    type: "Point",
    coordinates: [100.0, 0.0]
  }
}
8. GeoJSON allows us to encode
LineStrings:
{
  geo: {
    type: "LineString",
    coordinates: [ [100.0, 0.0], [101.0, 1.0] ]
  }
}
9. GeoJSON allows us to encode
Polygons:
{ geo: {
    type: "Polygon",
    coordinates: [
      [ [100.0, 0.0], [101.0, 0.0],
        [101.0, 1.0], [100.0, 1.0],
        [100.0, 0.0] ]
    ]
} }
10. Within searches on a sphere
• $geoWithin operator
• Takes a GeoJSON polygon geometry as a
specifier
• Returns any geometries of any type that are fully
contained within the polygon
• Works without any index.
11. Intersecting geometries on a sphere
• $geoIntersects operator
• Takes any GeoJSON geometry as a specifier
• Returns any geometries that have a non-empty
intersection
• Lots of edge cases – intersecting edges, points
on lines.
• Works without any index.
12. Better support for compound indexes
• Unlike 2d indexes, 2dsphere indexes aren’t
required to be the first field of a compound index
– Filtering potential documents before doing geo query can
drastically improve the performance of some queries
• “Find me Hot Dog Stands within New York state”
• “Find me geometries in New York state that are
Hot Dog Stands”
• Multiple geo fields can be in the same index
– “Find routes with start location within 50 miles of JFK and end
location within 100 miles of YYC”
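The "Hot Dog Stands within New York state" query can be sketched as a compound index with the non-geo field first, followed by the 2dsphere field. This is a minimal sketch, assuming a pymongo collection whose documents carry a `type` string and a GeoJSON `geo` field. Both field names and all function names are assumptions made for illustration, and the polygon passed in stands in for a real state boundary.

```python
def hot_dog_index_keys():
    # Non-geo field first, then the 2dsphere field. A 2d index
    # would have required the geo field to come first.
    return [("type", 1), ("geo", "2dsphere")]


def hot_dog_query(region_polygon):
    # Exact match on the type, then a within search on the sphere.
    return {
        "type": "Hot Dog Stand",
        "geo": {"$geoWithin": {"$geometry": region_polygon}},
    }


def find_hot_dog_stands(collection, region_polygon):
    # `collection` is assumed to be a pymongo Collection.
    collection.create_index(hot_dog_index_keys())
    return collection.find(hot_dog_query(region_polygon))
```

With this index the server can narrow the candidates by `type` before it evaluates the geometry test.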
14. • You can find all the code and data powering the
demo on GitHub, and read about it on my blog
• Let’s take a closer look at the Python that does
the actual work.
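15. It’s this simple - within
def find_within(points):
    # A GeoJSON polygon ring must be closed: the first point
    # also appears as the last point. Build a new list so the
    # caller's list is not modified.
    ring = list(points) + [points[0]]
    poly = {
        "type": "Polygon",
        "coordinates": [ring]
    }
    # `collection` is a pymongo Collection with a GeoJSON field "geo".
    # $geoWithin is the 2.4 name of the operator; $within is deprecated.
    places = collection.find(
        {"geo": {"$geoWithin": {"$geometry": poly}}})
    places.limit(500)
    return places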
16. It’s this simple - intersects
def find_intersects(points):
    line = {
        "type": "LineString",
        "coordinates": points
    }
    places = collection.find(
        {"geo": {"$geoIntersects":
            {"$geometry": line}}})
    places.limit(50)
    return places
17. It’s this simple - near
def find_nearest(point):
    point = {
        "type": "Point",
        "coordinates": point
    }
    places = collection.find(
        {"geo": {"$near": {"$geometry": point}}})
    places.limit(10)
    return places
19. How do you index a spherical coordinate?
• Divide the geometry that you are indexing into a
grid.
• For each cell in the grid, calculate a key, based
upon its position on the sphere.
• Insert each cell into a standard B-tree
• MongoDB uses Google’s S2 C++ library for the
heavy lifting.
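The steps above can be illustrated with a toy grid. The sketch below is not S2: it assumes a simple quadtree over a longitude/latitude rectangle, and it uses a sorted Python list as a stand-in for the B-tree. All names and sample coordinates are hypothetical. It shows only the general idea that a cell key encodes position, so that nearby points share a key prefix and can be found with a range scan.

```python
import bisect


def cell_key(lon, lat, level):
    """Quadtree key (digits 0-3) of the cell containing the point."""
    west, east = -180.0, 180.0
    south, north = -90.0, 90.0
    digits = []
    for _ in range(level):
        mid_lon = (west + east) / 2.0
        mid_lat = (south + north) / 2.0
        quadrant = 0
        if lon >= mid_lon:
            quadrant += 1
            west = mid_lon
        else:
            east = mid_lon
        if lat >= mid_lat:
            quadrant += 2
            south = mid_lat
        else:
            north = mid_lat
        digits.append(str(quadrant))
    return "".join(digits)


def build_index(places, level):
    """Sorted (key, name) pairs; a stand-in for B-tree entries."""
    return sorted((cell_key(lon, lat, level), name)
                  for name, (lon, lat) in places.items())


def scan_prefix(index, prefix):
    """Range scan: names of all entries whose key starts with prefix."""
    start = bisect.bisect_left(index, (prefix,))
    names = []
    for key, name in index[start:]:
        if not key.startswith(prefix):
            break
        names.append(name)
    return names


places = {
    "Trafalgar Square": (-0.128, 51.508),
    "JFK": (-73.78, 40.64),
    "Sydney": (151.21, -33.87),
}
index = build_index(places, 8)
```

A coarse prefix such as `"2"` matches every place in that quadrant of the grid, while a longer prefix narrows the scan to a smaller cell.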
21. Coverings
• A covering of a geometry is a minimal set of cells
that completely covers the geometry
• S2 can efficiently generate coverings for arbitrary
geometries.
22. Covering of Grid of the UK
23. Covering of A4 surrounding Trafalgar Square
24. Cells
• S2 defines cell sizes from level 0 to level 30
• The higher the level, the smaller the cell
• Different levels are optimized for different queries
– If you have densely packed geometries, and you are
doing a $near search, a higher level will be efficient
– If you are doing a $within search with a large polygon, a
lower level will be more efficient
• By default we use all levels between 500m and
100km on a side
25. Near search
26. Near search
27. Near search
28. Near search
29. Near search
30. Near search
This is 6th grade geometry on the Cartesian plane, often called (inexactly) Euclidean geometry. A plane is infinite in all directions. This is a convenient way of reasoning about geometry because math on the plane is easy. As a simplification of a sphere, however, it has pretty big problems as soon as you start to worry about large polygons, long lines, or any degree of accuracy.
As is excellently highlighted by Randall Munroe of xkcd, projecting a sphere onto a plane is non-obvious. It’s similarly not easy in the other direction. Managing the math for spheres is much more difficult than on a plane, and definitely not something most of us want to implement.
The 2d index has been available since MongoDB 1.4. End this slide by saying: “All this is great, but there are some additional features that we might like.”
Points are great, but we want to store arbitrary polygons, lines, etc.
Notice that the first point is the same as the last point. This is the simplest polygon form. The coordinate specification is a list of lists of point specs. The first list of point specifications describes the exterior shell of the polygon, and each subsequent list of points describes a hole in the polygon. MongoDB will reject any polygons that self-intersect with a parse error.
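To make the shell-and-holes layout concrete, here is a minimal sketch in Python that builds a GeoJSON polygon document with one hole. The helper names `closed_ring` and `polygon` are hypothetical, and the hole coordinates are chosen only for illustration.

```python
def closed_ring(points):
    """Copy a list of [lon, lat] points, repeating the first point
    at the end if the ring is not already closed."""
    ring = [list(p) for p in points]
    if ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return ring


def polygon(shell, holes=()):
    """GeoJSON Polygon: the first ring is the exterior shell,
    every following ring is a hole."""
    rings = [closed_ring(shell)] + [closed_ring(h) for h in holes]
    return {"type": "Polygon", "coordinates": rings}


shell = [[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0]]
hole = [[100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8]]
poly_with_hole = polygon(shell, [hole])
```

The resulting dictionary can be stored as the value of a geo field or used as the `$geometry` of a query.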
With large polygons, within searches on the plane can give significantly different results from within searches on the sphere, because on the sphere the edges of the polygon follow the curvature of the sphere.
Re: edge cases: Some are documented on mongodb.org, but there are far too many to detail, so make sure to play around with your particular edge cases.
If you have a collection of documents that are all the businesses in America, filtering for type Hot Dog Stand will reduce the set of results significantly, and an exact-match string comparison on a normal MongoDB index is a very quick operation compared to a geo index search. Because of that, stating the question in the first order will be much faster than stating it in the second way. Indexing multiple geo fields was not possible before 2.4, and it makes possible a whole suite of queries that weren’t possible before.
The 1st point and 2nd point define the first line. The 2nd point and 3rd point define the second line. And so on.
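The pairing of consecutive points can be shown in a couple of lines of Python. This is only an illustration of how the coordinates of a LineString are read. The function name and the sample route are made up.

```python
def segments(points):
    """Pair each point with its successor: n points give n - 1 segments."""
    return list(zip(points, points[1:]))


route = [[100.0, 0.0], [101.0, 1.0], [102.0, 0.5]]
route_segments = segments(route)
```

Here `route_segments` holds two segments, one from the first point to the second and one from the second to the third.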
$maxDistance is an optional operator that lets us specify the maximum distance from the point within which to search.
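A near query with a distance limit might be built as follows. This is a sketch under assumptions: the function name is hypothetical, the distance is taken to be in meters for a GeoJSON point, and `$maxDistance` is placed inside the `$near` document as the current MongoDB manual shows. Check the documentation for your server version before relying on that placement.

```python
def near_query(field, lon, lat, max_distance_m=None):
    """Build a $near query document for a GeoJSON point.

    max_distance_m is optional; when omitted, no distance limit is set.
    """
    near = {"$geometry": {"type": "Point", "coordinates": [lon, lat]}}
    if max_distance_m is not None:
        near["$maxDistance"] = max_distance_m
    return {field: {"$near": near}}


# Hypothetical example: places within 500 m of a point in central London.
query = near_query("geo", -0.1303, 51.5103, max_distance_m=500)
```

The result can be passed to `collection.find(...)` in the same way as the query in `find_nearest`.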
Tricky bits: How do you use that index efficiently? How do you decide the size of the cells? How do you calculate the B-tree key?
Works by looking at concentric donuts starting from the center point. Here we are searching for pubs near a point on Leicester Square. Nothing in donut 1.
The Porcupine is within the second donut. The Brewmaster is within the covering for the second donut, but it isn’t actually within the donut itself.
This continues until we have found enough points to fill a batch.
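The donut search can be sketched in memory. This is not MongoDB's implementation: a brute-force scan of all places stands in for looking up the covering of each donut in the index, and the ring widths, the Earth radius constant, and the pub coordinates are assumptions made for the example. What it keeps is the shape of the algorithm: examine one donut at a time, keep only the points that are really inside it, and stop once a batch is full.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation


def haversine_m(a, b):
    """Great-circle distance in meters between two [lon, lat] points."""
    lon1, lat1 = map(math.radians, a)
    lon2, lat2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2)
         * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def near_search(center, places, batch_size,
                first_radius_m=100.0, max_radius_m=20000000.0):
    """Return up to batch_size place names, nearest first."""
    found = []
    inner, outer = 0.0, first_radius_m
    while len(found) < batch_size and inner < max_radius_m:
        # Exact distance test: keep only points really inside this donut.
        ring = []
        for name, point in places.items():
            d = haversine_m(center, point)
            if inner <= d < outer:
                ring.append((d, name))
        found.extend(sorted(ring))
        # Move outward: the next donut starts where this one ended.
        inner, outer = outer, outer * 2
    return [name for _, name in found[:batch_size]]


leicester_square = [-0.1303, 51.5103]
pubs = {
    "The Porcupine": [-0.1303, 51.5118],   # hypothetical coordinates
    "The Brewmaster": [-0.1303, 51.5133],  # hypothetical coordinates
    "A distant pub": [-0.1303, 51.5303],   # hypothetical coordinates
}
nearest_two = near_search(leicester_square, pubs, batch_size=2)
```

Because the donuts do not overlap and grow outward, the results come back in distance order without sorting the whole collection.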