Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Enabling Efficient Cloud Workflows
Eugene Cheipesh
echeipesh@azavea.com
GitHub: @echeipesh
www.azavea.com
Cloud Native Workflows
“Hey, let's not download the whole file every time.”
Cloud Native
● Network is between storage and computation
● Storage is chunked and probably Key/Value
● Computation must s...
Cloud Optimized GeoTiff
● Defines an internal structure of GeoTiffs that is
optimized for streaming reading from blob
stor...
https://trac.osgeo.org/gdal/wiki/CloudOptimizedGeoTIFF
http://www.cogeo.org/
“A GeoTIFF file is a TIFF 6.0 file, and
inherits the file structure as described
in the corresponding portion of the
TIFF ...
TIFF
● Tagged Image File Format
● Binary image is separated out into
“segments”
● Contains metadata in a sequence of “tags...
GeoTiff Custom Tags
● GeoTiff adds custom tag GeoKeys
○ non-TIFF keys for CRS map space etc.
● This is how we know where t...
GeoTiff Streaming Read
● Step 1: Fetch the header
● Step 2: Decide what segments you want
● Step 3: Read the segments
Request: BBOX Mean
GeoTIFF: All of the above
COG: IFDs First - Rule #1
TIFF Segments
● Chunks of an image are stored at different
locations. This allows for parts of the image to
be decompresse...
TIFF: Strip Segments
COG: Strip Segments - Bad
TIFF: Tile Segments
COG: Use Tiled Segments - Rule #2
Request: BBOX Mean at 60m
Request: BBOX Mean at 60m
GeoTIFF: Has Overviews
COG: Provide Overviews - Rule #3
COG Rules
● Put IFDs first
● Sort IFDs by resolution
● Use Tiled segments
● Include overviews (in file)
COG Rules Continued ?
● Provenance Metadata
● Include Summary Statistics
● File naming convention
● Catalog suggestion
COG “Profiles”: Slippy Map
● Restrict CRS as EPSG:3857
● Align GeoTiff segments with slippy tile grid
● Align GeoTiff file...
COG World Coverage?
“You’re going to need more than one GeoTIFF.”
s3://[bucket]/[Shard]_[GeoHash]_[ISO_TIME]_[ID].tiff
GeoTrellis & COGs
COG: Tile Request
COG Tiler
● Application specific catalog
● Reprojection on read
● Memache gives responsiveness
● Raster metadata pushed in...
GeoTrellis Layers
GeoTrellis: Avro Layers
GeoTrellis Avro Layers
● Many small files, one per tile
○ PUT requests get expensive
● Data duplication hazzard
○ Probably...
GeoTrellis: COG Layers
GeoTrellis Strucuted COG
● Base zoom tiles are in base GeoTiff layer
● Introduces “DATE_TIME” tag
● Includes VRT for layer...
GeoTrellis: COG Pyramid
Unstructured COGs
Cloud Native thinking
GeoTrellis Unstructured COG
● Lots of good raster sources are already COG
● They don’t have a strict layout though
● Lets ...
COG: Tile Request (Again)
GeoTrellis Unstructured COG
● Requires mechanism to query for raster names
○ WFS, STAC, PostgreSQL, Scan
● COG provies “gr...
Coregistration Workflow
● Working with multiple raster layers with differing:
○ CRS, Resolution, Extent
● Must pick the wo...
Coregisteration Workflow
COGs for GeoTrellis
● Treating GeoTiff as a Key/Value tile store
○ With built in notion level of detail
● Turns an ingest ...
COGs in General
● GeoJSON of raster formats
● Gradated spec
● Set of best practices
● Driven by implementations
THANKS
Q: Is this a good idea ?
Q: So is this a standard ?
Q: Why not use MRF ?
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Cloud Optimized GeotTIFFs: enabling efficient cloud workflows
Upcoming SlideShare
Loading in …5
×

of

Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 1 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 2 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 3 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 4 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 5 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 6 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 7 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 8 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 9 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 10 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 11 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 12 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 13 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 14 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 15 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 16 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 17 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 18 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 19 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 20 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 21 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 22 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 23 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 24 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 25 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 26 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 27 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 28 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 29 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 30 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 31 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 32 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 33 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 34 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 35 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 36 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 37 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 38 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 39 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 40 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 41 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 42 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 43 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 44 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 45 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 46 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 47 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 48 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 49 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 50 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 51 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 52 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 53 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 54 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 55 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 56 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 57 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 58 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 59 Cloud Optimized GeotTIFFs: enabling efficient cloud workflows  Slide 60
Upcoming SlideShare
What to Upload to SlideShare
Next
Download to read offline and view in fullscreen.

2 Likes

Share

Download to read offline

Cloud Optimized GeotTIFFs: enabling efficient cloud workflows

Download to read offline

Small changes to the way common GeoTIFF file is written has broad implications for enabling efficient use of raster data in cloud native workflows.

Related Books

Free with a 30 day trial from Scribd

See all

Related Audiobooks

Free with a 30 day trial from Scribd

See all

Cloud Optimized GeotTIFFs: enabling efficient cloud workflows

  1. 1. Enabling Efficient Cloud Workflows
  2. 2. Eugene Cheipesh echeipesh@azavea.com GitHub: @echeipesh www.azavea.com
  3. 3. Cloud Native Workflows “Hey, let's not download the whole file every time.”
  4. 4. Cloud Native ● Network is between storage and computation ● Storage is chunked and probably Key/Value ● Computation must scale ● Coordination is limited and risky
  5. 5. Cloud Optimized GeoTiff ● Defines an internal structure of GeoTiffs that is optimized for streaming reading from blob storage like S3 or Google Cloud Storage. ● That is, it’s best utilized for requests that use HTTP GET Range Requests
  6. 6. https://trac.osgeo.org/gdal/wiki/CloudOptimizedGeoTIFF
  7. 7. http://www.cogeo.org/
  8. 8. “A GeoTIFF file is a TIFF 6.0 file, and inherits the file structure as described in the corresponding portion of the TIFF spec. All GeoTIFF specific information is encoded in several additional reserved TIFF tags, and contains no private Image File Directories (IFD's), binary structures or other private information invisible to standard TIFF readers.” http://geotiff.maptools.org/spec/geotiff2.4.html#2.4 GeoTiff
  9. 9. TIFF ● Tagged Image File Format ● Binary image is separated out into “segments” ● Contains metadata in a sequence of “tags” ● Can contain metadata for multiple images in different “image file directories” (IDFs)
  10. 10. GeoTiff Custom Tags ● GeoTiff adds custom tag GeoKeys ○ non-TIFF keys for CRS map space etc. ● This is how we know where the pixels are on the surface of the earth (the geospatial bit). http://geotiff.maptools.org/spec/geotiff6.html#6.1
  11. 11. GeoTiff Streaming Read ● Step 1: Fetch the header ● Step 2: Decide what segments you want ● Step 3: Read the segments
  12. 12. Request: BBOX Mean
  13. 13. GeoTIFF: All of the above
  14. 14. COG: IFDs First - Rule #1
  15. 15. TIFF Segments ● Chunks of an image are stored at different locations. This allows for parts of the image to be decompressed and loaded separately. ● There are two methods for segment layout: Striped and Tiled
  16. 16. TIFF: Strip Segments
  17. 17. COG: Strip Segments - Bad
  18. 18. TIFF: Tile Segments
  19. 19. COG: Use Tiled Segments - Rule #2
  20. 20. Request: BBOX Mean at 60m
  21. 21. Request: BBOX Mean at 60m
  22. 22. GeoTIFF: Has Overviews
  23. 23. COG: Provide Overviews - Rule #3
  24. 24. COG Rules ● Put IFDs first ● Sort IFDs by resolution ● Use Tiled segments ● Include overviews (in file)
  25. 25. COG Rules Continued ? ● Provenance Metadata ● Include Summary Statistics ● File naming convention ● Catalog suggestion
  26. 26. COG “Profiles”: Slippy Map ● Restrict CRS as EPSG:3857 ● Align GeoTiff segments with slippy tile grid ● Align GeoTiff files with tile boundaries ● Serve tile endpoints directly from COG
  27. 27. COG World Coverage? “You’re going to need more than one GeoTIFF.”
  28. 28. s3://[bucket]/[Shard]_[GeoHash]_[ISO_TIME]_[ID].tiff
  29. 29. GeoTrellis & COGs
  30. 30. COG: Tile Request
  31. 31. COG Tiler ● Application specific catalog ● Reprojection on read ● Memache gives responsiveness ● Raster metadata pushed into files
  32. 32. GeoTrellis Layers
  33. 33. GeoTrellis: Avro Layers
  34. 34. GeoTrellis Avro Layers ● Many small files, one per tile ○ PUT requests get expensive ● Data duplication hazzard ○ Probably not authoritative ● Limited access ○ Requires GeoTrellis code to read
  35. 35. GeoTrellis: COG Layers
  36. 36. GeoTrellis Strucuted COG ● Base zoom tiles are in base GeoTiff layer ● Introduces “DATE_TIME” tag ● Includes VRT for layer files ● GeoTiff segments match tile size and layout ● Additional zoom levels stored in overviews ● Pyramid up when overviews “run out”
  37. 37. GeoTrellis: COG Pyramid
  38. 38. Unstructured COGs Cloud Native thinking
  39. 39. GeoTrellis Unstructured COG ● Lots of good raster sources are already COG ● They don’t have a strict layout though ● Lets “inline” ingest process into the batch job ● Slower but reduces data duplication
  40. 40. COG: Tile Request (Again)
  41. 41. GeoTrellis Unstructured COG ● Requires mechanism to query for raster names ○ WFS, STAC, PostgreSQL, Scan ● COG provies “gradual” wins over GeoTiff ● Reproject on read to match workflow
  42. 42. Coregistration Workflow ● Working with multiple raster layers with differing: ○ CRS, Resolution, Extent ● Must pick the working/target layout ● We can “push-down” this process into IO
  43. 43. Coregisteration Workflow
  44. 44. COGs for GeoTrellis ● Treating GeoTiff as a Key/Value tile store ○ With built in notion level of detail ● Turns an ingest workflow into query workflow ○ Use the metadata to plan the read ● Opens up the results ○ QGIS, GeoServer, GDAL
  45. 45. COGs in General ● GeoJSON of raster formats ● Gradated spec ● Set of best practices ● Driven by implementations
  46. 46. THANKS
  47. 47. Q: Is this a good idea ?
  48. 48. Q: So is this a standard ?
  49. 49. Q: Why not use MRF ?
  • ssuserc32eed1

    Aug. 15, 2019
  • wsh6759

    Apr. 1, 2019

Small changes to the way common GeoTIFF file is written has broad implications for enabling efficient use of raster data in cloud native workflows.

Views

Total views

5,611

On Slideshare

0

From embeds

0

Number of embeds

2,811

Actions

Downloads

33

Shares

0

Comments

0

Likes

2

×