
Enabling Access to Big Geospatial Data with LocationTech and Apache projects

LocationPowers OGC BigGeoData 2016

This presentation will discuss tools in the open source landscape that are used to handle big geospatial data. We will focus on how Apache frameworks such as Spark and Accumulo are "geospatially enabled" by four projects: GeoTrellis, GeoWave, GeoMesa, and GeoJinni. All four projects participate in LocationTech, a working group under the Eclipse Foundation. In particular, we will discuss how each of these LocationTech technologies implements spatial indexing (e.g. by using space filling curves) in order to provide quick access to data, along with other themes common to the four projects. Attendees should walk away from this presentation understanding important parts of the Apache big data ecosystem, a set of LocationTech projects at the cutting edge of enabling those Apache projects to handle geospatial data, and some solutions to common problems when dealing with large geospatial data.
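
To make the idea of a space filling curve concrete, here is a minimal Python sketch of a Z-order (Morton) index, which maps a 2D grid cell to a single sortable integer by interleaving the bits of its coordinates. The function names `z_index` and `z_decode` and the 16-bit default are assumptions made for illustration; this is not the code used by GeoTrellis, GeoWave, GeoMesa, or GeoJinni.

```python
def z_index(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order index.

    Bits of x land in even positions, bits of y in odd positions.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def z_decode(z: int, bits: int = 16) -> tuple[int, int]:
    """Invert z_index: split the interleaved bits back into (x, y)."""
    x = y = 0
    for i in range(bits):
        x |= ((z >> (2 * i)) & 1) << i
        y |= ((z >> (2 * i + 1)) & 1) << i
    return x, y


if __name__ == "__main__":
    # Walking the 2x2 grid in index order traces the "Z" shape.
    for cell in [(0, 0), (1, 0), (0, 1), (1, 1)]:
        print(cell, "->", z_index(*cell))
```

Because cells that are close in space tend to receive nearby index values, a store that keeps keys in sorted order can answer many spatial queries by scanning a small number of contiguous key ranges.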



  1. Rob Emanuele ENABLING ACCESS TO BIG GEOSPATIAL DATA WITH &
  2. What we’ll be covering… LocationTech projects that geospatially enable Apache big data frameworks by providing spatial indexing. Discuss how those four projects approach indexing, focusing on the use of space filling curves.
  3. STORING AND PROCESSING GEOSPATIAL DATA @ SCALE
  4. STORING AND PROCESSING GEOSPATIAL DATA @ SCALE
  5. WHAT IS ?
  6. GEOJINNI (FORMERLY SPATIALHADOOP)
  7. SPACE FILLING CURVES
  8. 00 01 10 11 10 11 00 01 11 10 00 01 Hilbert Index (52) = 11 01 00
  9. Geo + accessed through
  10. Z curve
  11. Z curve (also XZ)
  12. Geo + accessed through GEOWAVE
  13. Hilbert Curve
  14. Range Decomposition: 70 -> 75, 92 -> 99, 116 -> 121
  15. False positives - secondary filtering
  16. Geo + Rasters +
  17. Z or Hilbert
  18. Accumulo: BigTable clone (columnar database). Records stored on HDFS. Lexicographically sorted table index. (Diagram labels: Name Node, Data Node x3, Master, Tablet Server x3)
  19. partition id split id
  20. split id partition id
  21. Tiered Indexing
  22. Tiered Indexing
  23. Periodicity (time dimension): 1997, 1998, 1999
  24. Periodicity (arbitrary dimensions): Time, Elevation, Velocity
  25. Spatial Indexing: spatial index stored per file on HDFS; Z order (2D and 3D), Hilbert (N-dimensional); Z order (2D and 3D), binned per week for spatiotemporal; N-dimensional Hilbert with arbitrary binning and tiered indexing
  26. CQL
  27. Future integration work ?
  28. THANK YOU @lossyrob gitter.im/geotrellis/geotrellis github.com/geotrellis/geotrellis remanuele@azavea.com
  29. GeoMesa GeoWave
  30. Tiered Indexing
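
The "Range Decomposition" and "False positives - secondary filtering" slides describe a two-step query pattern that can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation of any of the four projects: it uses a Z-order curve rather than the Hilbert curve shown on the slides, a sorted in-memory list stands in for a lexicographically sorted table such as Accumulo's, and `decompose`, `query`, and the `max_gap` parameter are hypothetical names. Merging nearby ranges (via `max_gap`) reduces the number of scans at the cost of reading cells outside the query box, which the secondary filter then discards.

```python
import bisect


def z_index(x: int, y: int, bits: int = 8) -> int:
    """Z-order index: interleave bits of x (even) and y (odd)."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def decompose(xmin, ymin, xmax, ymax, max_gap=0):
    """Turn a query box into contiguous index ranges (inclusive).

    Ranges separated by at most `max_gap` missing index values are merged,
    trading fewer scans for more false positives.
    """
    codes = sorted(
        z_index(x, y)
        for x in range(xmin, xmax + 1)
        for y in range(ymin, ymax + 1)
    )
    ranges = []
    for c in codes:
        if ranges and c - ranges[-1][1] <= max_gap + 1:
            ranges[-1][1] = c  # extend the current range
        else:
            ranges.append([c, c])  # start a new range
    return [(lo, hi) for lo, hi in ranges]


def query(index, box, max_gap=0):
    """Scan each range in the sorted index, then filter out false positives."""
    xmin, ymin, xmax, ymax = box
    keys = [k for k, _ in index]
    candidates = []
    for lo, hi in decompose(xmin, ymin, xmax, ymax, max_gap=max_gap):
        start = bisect.bisect_left(keys, lo)
        end = bisect.bisect_right(keys, hi)
        candidates.extend(index[start:end])
    # Secondary filtering: check the real coordinates of every candidate.
    hits = [
        p for _, p in candidates
        if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax
    ]
    return candidates, hits


if __name__ == "__main__":
    # One point per cell of a 4x4 grid, stored sorted by index key.
    index = sorted((z_index(x, y), (x, y)) for x in range(4) for y in range(4))
    box = (1, 1, 2, 2)
    print(decompose(*box))             # exact ranges, one per cell here
    print(decompose(*box, max_gap=2))  # a single merged range
    candidates, hits = query(index, box, max_gap=2)
    print(len(candidates), "scanned,", len(hits), "kept")
```

In this toy grid the exact decomposition needs four single-cell ranges, while the merged version needs one scan but reads six cells that lie outside the box, all of which are dropped by the coordinate check.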
