This presentation will discuss the successes of GeoWave applied to the spatiotemporal domain and focus on how those successes can be generalized to a diverse set of complex data structures. The intent is to draw parallels to the data challenges of the audience.
Fast indexed access to massive datasets fundamentally involves highly optimized range scans within key-value stores. If your reaction is "that's easier said than done," then you've had the prerequisite experiences to attend this talk. The intent of GeoWave is to make these use cases as seamless as possible for downstream consumers of the framework. Briefly, a GeoWave "dimension" is simply a function that applies a sort order to real-world values. The constructs for defining these "dimensions," along with many more details, will be discussed in this presentation.
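To make the idea of a "dimension" concrete, the following is a minimal sketch and not GeoWave's API. It assumes a hypothetical `time_dimension` function that maps timestamps (at or after the Unix epoch) to fixed-width keys whose byte order matches chronological order, and a toy in-memory `SortedStore` standing in for a sorted key-value store. With keys ordered this way, a time-window query becomes a single contiguous range scan.

```python
import bisect
from datetime import datetime, timezone


def time_dimension(ts: datetime, bin_seconds: int = 3600) -> bytes:
    """Hypothetical 'dimension': map a timestamp to a sortable key.

    The timestamp is bucketed (hourly by default) and encoded as an 8-byte
    big-endian integer, so comparing keys byte by byte gives the same order
    as comparing the timestamps. Assumes timestamps at or after the epoch.
    """
    bucket = int(ts.timestamp()) // bin_seconds
    return bucket.to_bytes(8, "big")


class SortedStore:
    """Toy stand-in for a sorted key-value store (illustration only)."""

    def __init__(self):
        self._keys = []
        self._values = []

    def put(self, key: bytes, value) -> None:
        # Insert while keeping keys in sorted order.
        i = bisect.bisect_right(self._keys, key)
        self._keys.insert(i, key)
        self._values.insert(i, value)

    def range_scan(self, start: bytes, end: bytes) -> list:
        # Return all values whose key lies in [start, end], inclusive.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        return self._values[lo:hi]


def utc(year, month, day, hour):
    return datetime(year, month, day, hour, tzinfo=timezone.utc)


store = SortedStore()
# Events are written in arbitrary order; the key encoding sorts them.
for name, ts in [
    ("c", utc(2020, 1, 1, 12)),
    ("a", utc(2020, 1, 1, 3)),
    ("b", utc(2020, 1, 1, 7)),
    ("d", utc(2020, 1, 2, 0)),
]:
    store.put(time_dimension(ts), name)

# A query for 05:00 through 13:00 is one contiguous scan over the keys.
window = store.range_scan(
    time_dimension(utc(2020, 1, 1, 5)),
    time_dimension(utc(2020, 1, 1, 13)),
)
print(window)
```

A single dimension like this is the simple case; combining several dimensions into one sortable key is the harder problem the talk addresses.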
At the core of GeoWave is a capability to store, retrieve, and analyze multi-dimensional data structures within distributed key-value stores. Spatiotemporal data is a special case for which GeoWave provides tailored extensions. The software is intended to be easily pluggable into any sorted key-value store, with current implementations available for Apache HBase, Apache Accumulo, Apache Cassandra, Apache Kudu, Redis, RocksDB, Google Bigtable, and Amazon DynamoDB. Datastore support is provided as an extension that is discovered at runtime. Consequently, access through any GeoWave programmatic API, command line, or service is not tied to any particular key-value store. Furthermore, there are optimized data transfer utilities across supported stores. This approach has proven to provide seamless transitions of scale, from embedded applications, through external in-memory services, all the way up to its primary applications within highly distributed ecosystems.
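The pluggable-store idea can be sketched in a few lines. This is an illustration under stated assumptions, not GeoWave's extension mechanism: the `KeyValueStore` interface, the `register_backend` and `open_store` helpers, the `"memory"` backend name, and `copy_range` are all hypothetical. The point is that calling code names a backend only when opening it, so everything else, including a store-to-store transfer, is written against the shared interface.

```python
from typing import Callable, Dict, Iterator, Protocol, Tuple


class KeyValueStore(Protocol):
    """Hypothetical minimal interface every backend must satisfy."""

    def put(self, key: bytes, value: bytes) -> None: ...

    def scan(self, start: bytes, end: bytes) -> Iterator[Tuple[bytes, bytes]]: ...


# Registry of backend factories, filled in as backends are loaded.
_BACKENDS: Dict[str, Callable[[], KeyValueStore]] = {}


def register_backend(name: str):
    """Decorator that makes a backend discoverable under a name."""
    def decorator(factory):
        _BACKENDS[name] = factory
        return factory
    return decorator


def open_store(name: str) -> KeyValueStore:
    """Look up a backend by name at runtime and create an instance."""
    return _BACKENDS[name]()


@register_backend("memory")
class InMemoryStore:
    """Toy backend; a real one would wrap a client for an external store."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def scan(self, start: bytes, end: bytes):
        # Yield entries with start <= key <= end, in key order.
        for key in sorted(self._data):
            if start <= key <= end:
                yield key, self._data[key]


def copy_range(src: KeyValueStore, dst: KeyValueStore,
               start: bytes, end: bytes) -> int:
    """Copy a key range between any two stores; returns the entry count."""
    count = 0
    for key, value in src.scan(start, end):
        dst.put(key, value)
        count += 1
    return count


src = open_store("memory")
dst = open_store("memory")
src.put(b"a", b"1")
src.put(b"b", b"2")
src.put(b"z", b"9")
copied = copy_range(src, dst, b"a", b"m")
print(copied, list(dst.scan(b"", b"\xff")))
```

Because `copy_range` depends only on the interface, swapping either side for a different registered backend would require no change to the transfer code.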