www.luxoft.com
Server-side Geo Clustering Based
on Geohash
Evgeniy Khyst
02.06.2016
Geo Clustering
WHEN TOO MANY GEO-OBJECTS (POINTS, MARKERS,
PLACEMARKS) ARE CLUSTERED TOGETHER ON A MAP, THEY
MERGE INTO ONE BIG BARELY DISTINGUISHABLE BLOB.
MULTIPLE GEO-OBJECT COORDINATES AND OTHER DATA USE UP
LARGE AMOUNTS OF MEMORY, WHILE THE MAP DISPLAY
CONSUMES A LOT OF HARDWARE RESOURCES, WHICH CAN
CAUSE APPLICATIONS TO HANG.
● The standard solution to this problem is to group objects
located near one another together and represent them using a
special icon.
● A cluster icon usually specifies the number of objects it
contains, and users can zoom in to see the individual points in a
cluster.
● Clustering can increase performance dramatically when
displaying large numbers of geo-objects.
Client-side Geo Clustering
● Many JavaScript libraries for interactive maps provide client-
side clustering capabilities.
● With client-side clustering, individual points are retrieved from
the server and then processed in the browser or mobile app to
create clusters.
Client-side vs. Server-side Geo Clustering
✖ The disadvantage of client-side clustering is a huge response
payload, which costs time and memory on the client.
✔ The advantage of server-side clustering is a much smaller response
payload: a few clusters instead of thousands of geo-points. The response
is faster and the client consumes less memory.
Geohash
● Geohash is an alphanumeric string representation of latitude
and longitude coordinates.
● A point with a latitude of 50.450101 and a
longitude of 30.523401 is represented by the
Geohash string u8vxn84mnu3q.
● Removing characters from the end of a Geohash string reduces
the precision of the coordinates.
● Geohash u8vxn84mnu decodes to the coordinates
50.45010 and 30.5234, while the shorter Geohash u8vxn8 decodes to
50.45 and 30.5.
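The loss of precision can be made concrete with a small decoder. The following is a minimal Python sketch of the standard Geohash scheme, not the presenter's code; the function name decode_bounds is an assumption. It returns the latitude/longitude box that a Geohash string covers, so a shorter string yields a larger box.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet


def decode_bounds(geohash):
    """Return (lat_lo, lat_hi, lon_lo, lon_hi) of the cell named by `geohash`."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate between axes, starting with longitude
    for ch in geohash:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):  # most significant bit first
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid  # point lies in the eastern half
                else:
                    lon_hi = mid  # point lies in the western half
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid  # point lies in the northern half
                else:
                    lat_hi = mid  # point lies in the southern half
            is_lon = not is_lon
    return lat_lo, lat_hi, lon_lo, lon_hi


# Each removed character makes the cell larger in both directions.
for prefix in ("u8vxn84mnu", "u8vxn8", "u8v"):
    lat_lo, lat_hi, lon_lo, lon_hi = decode_bounds(prefix)
    print(prefix, "height (deg):", lat_hi - lat_lo, "width (deg):", lon_hi - lon_lo)
```

A decoder usually reports the center of this box as the decoded coordinates, which is why fewer characters give fewer trustworthy decimal digits.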
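● Points that share the same prefix are located near one another. The
converse does not always hold: two close points on opposite sides of a
cell boundary can have different prefixes.
● Geohash u8vxn84mnu decodes to the coordinates
50.45010 and 30.5234. A point about 29 kilometers away, at
50.348751 and 30.90151, is encoded as u8vyrjty9r7y. Both strings share
the prefix u8v.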
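The relation between a shared prefix and distance can be checked directly. The sketch below is illustrative only: haversine_km is a hypothetical helper name, and the Earth radius of 6371 km is the usual mean value. It prints the common prefix of the two Geohash strings above and the great-circle distance between the two points.

```python
import math
from os.path import commonprefix  # works character-wise on any strings


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


hash_a, hash_b = "u8vxn84mnu3q", "u8vyrjty9r7y"
shared = commonprefix([hash_a, hash_b])
distance = haversine_km(50.450101, 30.523401, 50.348751, 30.90151)

print("shared prefix:", shared)
print("distance (km):", round(distance, 1))
```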
This makes it easy to search for nearby locations.
For example, using SQL:
SELECT * FROM GEO_POINT WHERE GEOHASH LIKE 'u8v%';
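The prefix search can be tried without a database server. The sketch below uses Python's built-in sqlite3 module with an in-memory table as a stand-in for the real database; the two u8v... values come from the slides, while the third Geohash is an arbitrary value with a different prefix, added only to show that it is filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GEO_POINT (GEO_POINT_ID INTEGER PRIMARY KEY, GEOHASH TEXT NOT NULL)"
)
conn.execute("CREATE INDEX I_GEO_POI_GEOHASH ON GEO_POINT (GEOHASH)")
conn.executemany(
    "INSERT INTO GEO_POINT (GEOHASH) VALUES (?)",
    [("u8vxn84mnu3q",), ("u8vyrjty9r7y",), ("gcpvj0duq534",)],  # last one: arbitrary
)

# Geohash characters never include % or _, so the prefix needs no LIKE escaping.
prefix = "u8v"
rows = conn.execute(
    "SELECT GEOHASH FROM GEO_POINT WHERE GEOHASH LIKE ? ORDER BY GEOHASH",
    (prefix + "%",),
).fetchall()
nearby = [row[0] for row in rows]
print(nearby)
```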
Geohash Encoding
TO ENCODE THE LATITUDE AND LONGITUDE OF COORDINATES,
GEOHASH DIVIDES THE MAP INTO A GRID THAT BUCKETS
NEARBY POINTS TOGETHER.
Geohash binary values are represented by base-32 encoded
strings. Each 5-bit Geohash value is converted to a character
using the character map 0123456789bcdefghjkmnpqrstuvwxyz (the digits
and the lowercase letters except a, i, l and o).
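The grid subdivision and the base-32 step can be sketched in a few lines of Python. This is an illustrative sketch of the standard Geohash scheme, not the presenter's implementation; the function name encode and its precision parameter are assumptions. Each bit halves the current longitude or latitude interval in turn, and every five bits become one character.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet


def encode(lat, lon, precision=12):
    """Encode a latitude/longitude pair (degrees) as a Geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0        # bits collected for the current character
    bit_count = 0
    is_lon = True    # bits alternate between axes, starting with longitude
    while len(chars) < precision:
        if is_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1  # eastern half
                lon_lo = mid
            else:
                value <<= 1               # western half
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1  # northern half
                lat_lo = mid
            else:
                value <<= 1               # southern half
                lat_hi = mid
        is_lon = not is_lon
        bit_count += 1
        if bit_count == 5:                # five bits make one base-32 character
            chars.append(BASE32[value])
            value = 0
            bit_count = 0
    return "".join(chars)


print(encode(50.450101, 30.523401, 6))  # u8vxn8
```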
Server-side Geo Clustering Based on Geohash
● Geohash divides the world into a grid of cells, and each cell
represents a single cluster.
● The Geohash prefix length is tied to the zoom resolution: the closer
the zoom, the longer the prefix and the smaller the cells.
● For better visualization, the cluster can be placed at the average of
the coordinates of all points in the cell rather than at the center of
the cell.
CREATE TABLE GEO_POINT (
    GEO_POINT_ID  SERIAL PRIMARY KEY,
    LATITUDE_DEG  FLOAT8 NOT NULL,
    LONGITUDE_DEG FLOAT8 NOT NULL,
    GEOHASH       VARCHAR(12) NOT NULL,
    COUNTRY_CODE  VARCHAR(2)
);
CREATE INDEX I_GEO_POI_LAT_LON ON
    GEO_POINT (LATITUDE_DEG, LONGITUDE_DEG);
CREATE INDEX I_GEO_POI_GEOHASH ON GEO_POINT (GEOHASH);
SELECT AVG(GP.LATITUDE_DEG) AS LATITUDE_DEG,
       AVG(GP.LONGITUDE_DEG) AS LONGITUDE_DEG,
       COUNT(*) AS QUANTITY,
       SUBSTRING(GP.GEOHASH FROM 1 FOR :precision) AS GEOHASH_PREFIX,
       GP.COUNTRY_CODE AS COUNTRY_CODE
FROM GEO_POINT GP
WHERE GP.LATITUDE_DEG BETWEEN :south_west_lat AND :north_east_lat
  AND GP.LONGITUDE_DEG BETWEEN :south_west_lon AND :north_east_lon
GROUP BY GEOHASH_PREFIX, COUNTRY_CODE;
● south_west_lat/south_west_lon - latitude/longitude of the bottom left point of the
viewport bounding box
● north_east_lat/north_east_lon - latitude/longitude of the top right point of the viewport
bounding box
● precision - the Geohash prefix length, which determines the cluster size: a shorter prefix
means larger cells. The value depends on the distance between the south-west and north-east points.
● This query will return the coordinates of geo-point clusters and
the number of geo-objects in each cluster.
● Geo-points are grouped into clusters by Geohash prefix and
country.
● Geo-points that are close together but in different countries are
not grouped together even though they share the same
Geohash prefix. For this reason, the COUNTRY_CODE column is
included in the GROUP BY clause.
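The grouping performed by the query can be mirrored in plain Python to see what it returns. This is a simplified sketch, not the presenter's code: the function name cluster and the tuple layout are assumptions, and the viewport filter is left out. The first two sample points come from the slides with an assumed country code; the third is a made-up point that reuses the second point's coordinates with a placeholder country code to show the split by country.

```python
from collections import defaultdict


def cluster(points, precision):
    """Group (lat, lon, geohash, country_code) tuples by Geohash prefix and country."""
    groups = defaultdict(list)
    for lat, lon, geohash, country in points:
        groups[(geohash[:precision], country)].append((lat, lon))

    clusters = []
    for (prefix, country), members in groups.items():
        count = len(members)
        clusters.append({
            # The cluster marker sits at the average of its members, not the cell center.
            "latitude": sum(lat for lat, _ in members) / count,
            "longitude": sum(lon for _, lon in members) / count,
            "quantity": count,
            "geohash_prefix": prefix,
            "country_code": country,
        })
    return clusters


points = [
    (50.450101, 30.523401, "u8vxn84mnu3q", "UA"),  # country code assumed
    (50.348751, 30.90151, "u8vyrjty9r7y", "UA"),   # country code assumed
    (50.348751, 30.90151, "u8vyrjty9r7y", "XX"),   # made-up point, placeholder country
]

for row in cluster(points, precision=3):
    print(row)
```

With a prefix length of 3 the two UA points merge into one cluster while the XX point stays separate; with a prefix length of 4 all three end up in different clusters.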
● When we zoom in and out on a map, the Geohash prefix
changes accordingly.
● Geohash prefix length depends on the zoom resolution.
● Let the function $y = f(x)$ describe how the Geohash prefix
length $y$ depends on the zoom $x$.
● The function is exponential, $y = a e^{bx}$, rather than linear, $y = kx + b$.
$x$ – zoom, $y$ – Geohash prefix length,
$g_{min}$ – minimum Geohash prefix length, $g_{max}$ – maximum Geohash prefix length,
$z_{min}$ – minimum zoom, $z_{max}$ – maximum zoom.
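One way to fix the constants of the exponential is to assume that the curve passes through the two end points, that is, $f(z_{min}) = g_{min}$ and $f(z_{max}) = g_{max}$. Under that assumption, $b = \ln(g_{max} / g_{min}) / (z_{max} - z_{min})$ and $a = g_{min} e^{-b z_{min}}$. The Python sketch below applies these formulas; the function name geohash_precision, the default zoom range 0 to 18, the default prefix range 1 to 12, and the rounding and clamping are all assumptions made for illustration.

```python
import math


def geohash_precision(zoom, z_min=0, z_max=18, g_min=1, g_max=12):
    """Map a zoom level to a Geohash prefix length with y = a * exp(b * x)."""
    # Constants chosen so that f(z_min) = g_min and f(z_max) = g_max.
    b = math.log(g_max / g_min) / (z_max - z_min)
    a = g_min / math.exp(b * z_min)
    y = a * math.exp(b * zoom)
    # A prefix length must be a whole number inside [g_min, g_max].
    return max(g_min, min(g_max, round(y)))


for zoom in (0, 6, 9, 12, 15, 18):
    print("zoom", zoom, "-> prefix length", geohash_precision(zoom))
```

The returned value can be passed as the precision parameter of the clustering query.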
AN INTERACTIVE EXAMPLE IS AVAILABLE AT
HTTP://GEOHASH-EVGENIYKHIST.RHCLOUD.COM/
THE SOURCE CODE FOR THE EXAMPLE IS AVAILABLE ON GITHUB:
HTTPS://GITHUB.COM/EVGENIY-KHIST/GEOHASH-EXAMPLE/
THANK YOU
