GeoMesa is an open-source toolkit for processing and analyzing spatio-temporal data, such as IoT and sensor-produced observations, at scale. It provides a consistent API for querying and analyzing data on top of distributed databases (e.g. HBase, Accumulo, Bigtable, Cassandra) and messaging networks (e.g. Kafka) to handle batch analysis of historical archives of data and low-latency processing of data in-stream.
GeoMesa has deep integration with Spark SQL. It has added spatial types (e.g. Point, LineString, Polygons), spatial predicates (st_contains, st_intersects, etc.), and geometry processing functions (e.g. st_buffer, st_convexHull, etc.) to Spark SQL. It also optimizes the processing of these extensions by integrating with the Catalyst SQL optimizer to intercept SQL statements with spatial predicates and provision RDDs based on the underlying spatial index.
This session will describe the implementation of the GeoMesa Spark SQL integration, illustrate its application in production systems and demonstrate spatial aggregations and analytics using map-based visualizations.