2. Introduction
Creating maps and accurately depicting real-world entities on them has been a challenging
problem for years.
At Go-Jek (a ride-hailing company), we use the Google Maps APIs to calculate the shortest distance to be
travelled by a driver and to allocate the nearest available driver to the customer booking a ride.
To create our own map and find shortest paths on it, we have come up with “Columbus”.
3. Problem Statement
Create a map from driver GPS location pings, represent it as a directed graph, and find
the shortest path between any two points on it.
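
To make the target concrete, the following is a minimal sketch of a shortest-path query on a directed graph, using Dijkstra's algorithm as a standard choice; the text does not say which algorithm Columbus uses. The adjacency-dictionary layout, the function name `shortest_path`, and the toy `roads` graph with distances in metres are assumptions made for illustration.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra on a directed graph given as {node: {neighbour: weight}}.

    Returns (total_weight, [source, ..., target]), or (inf, []) when the
    target cannot be reached by following edge directions.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        if node == target:
            break
        for nbr, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]

# Hypothetical toy road graph: edge weights are road lengths in metres.
roads = {
    "A": {"B": 120.0, "C": 300.0},
    "B": {"C": 100.0},
    "C": {},  # no outgoing roads, so nothing is reachable from C
}
```

Because the edges are directed, a route from "A" to "C" exists in this toy graph while the reverse query finds none, which is the behaviour one-way roads require.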
4. Data Collection and Data Visualisation
At Go-Jek, we receive driver location pings every 10 seconds. This enormous collection of pings is
published to Kafka by the Driver Location Service. We have collected and processed the GPS
locations of drivers using Flink.
This collection of driver GPS locations is visualised as a layer on top of OSM (OpenStreetMap), which
shows the GPS points clustered along the roads.
5. Approach
A. Identify the clusters in the collection of driver GPS locations.
B. Calculate the centroid of each cluster.
C. Use these centroids as the nodes of the graph representation of the map.
D. Detect the direction of each road by analysing the history of driver location pings.
E. Add directed edges representing the roads between the graph nodes.
F. Calculate the shortest path between any two given nodes of the directed graph that represents the map.
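
Steps A to E can be sketched as follows. This is an illustration under stated assumptions, not the Columbus implementation: the text does not name a clustering algorithm, so the sketch substitutes the simplest possible one (snapping pings to a fixed latitude/longitude grid), and it infers direction by counting how often consecutive pings of a trip move from one cell to the next. The names `CELL_DEG`, `cell_of`, `build_nodes`, `build_edges` and the `min_count` threshold are all hypothetical.

```python
from collections import defaultdict

CELL_DEG = 0.0005  # assumed grid cell size in degrees (roughly 55 m of latitude)

def cell_of(lat, lon):
    """Map a ping to the grid cell that acts as its cluster id."""
    return (int(lat // CELL_DEG), int(lon // CELL_DEG))

def build_nodes(pings):
    """Steps A-C: group pings by cell and return {cell: centroid (lat, lon)}."""
    members = defaultdict(list)
    for lat, lon in pings:
        members[cell_of(lat, lon)].append((lat, lon))
    return {
        cell: (
            sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts),
        )
        for cell, pts in members.items()
    }

def build_edges(trips, min_count=2):
    """Steps D-E: derive directed edges from time-ordered trips.

    Each trip is a list of (lat, lon) pings in the order they were received.
    A transition cell_a -> cell_b becomes an edge only if it was observed at
    least min_count times, which suppresses one-off GPS jumps.
    """
    counts = defaultdict(int)
    for trip in trips:
        cells = [cell_of(lat, lon) for lat, lon in trip]
        for a, b in zip(cells, cells[1:]):
            if a != b:
                counts[(a, b)] += 1
    return {edge for edge, n in counts.items() if n >= min_count}

# Two hypothetical trips driving eastwards along the same road.
trips = [
    [(-6.20010, 106.80010), (-6.20012, 106.80060), (-6.20011, 106.80110)],
    [(-6.20013, 106.80020), (-6.20010, 106.80070), (-6.20012, 106.80120)],
]
nodes = build_nodes(p for trip in trips for p in trip)
edges = build_edges(trips)
```

Here both trips cross the same three cells from west to east, so the sketch yields three nodes and two eastbound edges and no edge in the opposite direction. A density-based clustering method would follow road geometry better than a fixed grid; the grid is used only to keep the example short.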
6. Data Challenges
• The collection of GPS locations is enormous and takes a significant amount of time to process.
• The accuracy of the GPS locations varies, so reported driver locations can lie away from the actual road line.
• Clusters of GPS locations have to be identified after filtering out the outlier locations.
• Calculating the centroids of the GPS clusters and forming a graph with these centroids as nodes gives a large
interconnected mesh of nodes.
• The biggest challenge was identifying which connections between nodes represent roads and giving these
connections a sense of direction.
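
Two of these challenges, outlier pings and direction, can be illustrated with standard great-circle formulas. The sketch below is one possible approach rather than the method used in Columbus: it drops a ping when the speed implied by the jump from the last accepted ping is implausible, and it computes the initial compass bearing between two points as a direction signal. The 10-second interval comes from the text; `MAX_SPEED_MPS` and all function names are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius
PING_INTERVAL_S = 10.0        # ping interval stated in the text
MAX_SPEED_MPS = 40.0          # assumed plausibility limit (144 km/h)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0

def drop_speed_outliers(trip):
    """Keep pings whose implied speed from the last kept ping is plausible.

    trip is a time-ordered list of (lat, lon) pings spaced PING_INTERVAL_S
    apart. The first ping is always trusted, which is a simplification.
    """
    kept, last_i = [], None
    for i, (lat, lon) in enumerate(trip):
        if last_i is not None:
            elapsed = PING_INTERVAL_S * (i - last_i)
            if haversine_m(*kept[-1], lat, lon) / elapsed > MAX_SPEED_MPS:
                continue  # implausible jump: treat as GPS noise
        kept.append((lat, lon))
        last_i = i
    return kept

# Hypothetical trip heading east; the third ping jumps about 1 km north.
trip = [(-6.2000, 106.8000), (-6.2000, 106.8010),
        (-6.1900, 106.8020), (-6.2000, 106.8030)]
clean = drop_speed_outliers(trip)
```

In this example the third ping implies a speed well above the assumed limit and is discarded, while the remaining pings keep a bearing close to due east. Averaging such bearings over many trips between the same pair of nodes is one way to decide which direction an edge should take.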