The document discusses efficient exploitation of remote sensing data. It summarizes Grega Milcinski's presentation on Sentinel Hub, a platform for accessing and processing satellite data. It notes that large volumes of remote sensing data are created daily but pre-processing into "data cubes" limits flexibility. The document recommends processing data on-demand using cloud computing. Sentinel Hub is highlighted as an example that provides open data access through APIs and applications using AWS services. It processes over 50 million requests per month from various data sources. The document concludes that public data should be openly available in the cloud and reasonable business models are needed for commercial data.
3. About Sinergise
• 50-person small business
• Large geospatial applications
• Depends on own cash flow
• Sentinel Hub, Sentinel Playground, EO Browser
6. “Data cubes” options
• Pre-processed images
•“Just images”
•Fixed composite settings (true color, bands 721, ...)
•Fixed projections (Geographic, Web Mercator, Polar)
•Fixed processing steps (lack of flexibility)
•Storage and compute (regardless of the use)
• Pre-processed tiles
•Storage and compute
•Some limitations remain (lack of flexibility)
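To make the flexibility argument concrete, here is a minimal sketch of choosing the band combination at request time instead of storing a fixed composite. The tiny in-memory scene, the `stretch` and `composite` helpers, and the gain of 2.5 are all assumptions made for illustration; the band names follow Sentinel-2 naming (B04/B03/B02 for true color, B08 for near infrared).

```python
# Hypothetical 2x2 scene: band name -> grid of reflectance values in [0, 1].
SCENE = {
    "B02": [[0.05, 0.10], [0.08, 0.40]],  # blue
    "B03": [[0.07, 0.12], [0.10, 0.42]],  # green
    "B04": [[0.06, 0.15], [0.09, 0.45]],  # red
    "B08": [[0.30, 0.35], [0.28, 0.50]],  # near infrared
}

def stretch(value, gain=2.5):
    """Scale a reflectance to an 8-bit display value, clamped to 0..255."""
    return max(0, min(255, round(value * gain * 255)))

def composite(scene, bands, gain=2.5):
    """Build an RGB image from any three bands named in the request."""
    r, g, b = (scene[name] for name in bands)
    return [
        [
            (stretch(r[y][x], gain), stretch(g[y][x], gain), stretch(b[y][x], gain))
            for x in range(len(r[0]))
        ]
        for y in range(len(r))
    ]

# The same stored bands serve different composites; nothing is pre-rendered.
true_color = composite(SCENE, ("B04", "B03", "B02"))
false_color = composite(SCENE, ("B08", "B04", "B03"))
```

A pre-rendered archive would have to store `true_color` and `false_color` separately; here only the raw bands are stored and each composite costs compute only when someone asks for it.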
7. Recommendations
• Only process to the level widely supported by the community
• COG is a data cube (of some sort)
•JPEG2000 and HDF as well
• Further steps can be executed on-the-fly
•With the help of fast storage, on-demand compute
•Decompression, reprojection, filtering, mosaicking, compositing, analysis, output encoding, etc.
•Not good for everything... e.g. mass processing
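One reason a COG can act as "a data cube of some sort" is that it is internally tiled and carries reduced-resolution overviews, so a request only needs to read the tiles and overview level it actually touches. The sketch below shows that selection logic only. The 512-pixel tile size, the power-of-two overview factors, and both function names are assumptions for illustration, not details from the presentation.

```python
def tiles_for_window(col_off, row_off, width, height, tile_size=512):
    """Return (tile_col, tile_row) indices of internal tiles overlapping a pixel window.

    Assumes width and height are positive and offsets are non-negative.
    """
    first_col = col_off // tile_size
    last_col = (col_off + width - 1) // tile_size
    first_row = row_off // tile_size
    last_row = (row_off + height - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]

def overview_level(requested_resolution, native_resolution, levels=4):
    """Pick the coarsest overview (factor 2**level) still at least as fine as requested."""
    level = 0
    while level < levels and native_resolution * 2 ** (level + 1) <= requested_resolution:
        level += 1
    return level

# A 100x100 pixel window starting at column 500 straddles two tiles in the first row.
needed = tiles_for_window(500, 0, 100, 100)
# A 40 m/pixel request against 10 m/pixel data can be served from overview level 2 (factor 4).
level = overview_level(40, 10)
```

With fast object storage, each selected tile can then be fetched independently, which is what makes the later on-the-fly steps practical.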
8. (Diagram: data sources, interfaces and clients)
• Data sources
•Open EO data - Sentinel-1, Sentinel-2, Landsat, etc.
•Commercial EO data
•Aerial imagery (drone, plane)
•Other raster data
• Interfaces
•WMS, WMTS, WCS, API
• Clients
•Machine learning
•Scripting (R, Python, ENVI…)
•Web / Mobile apps
•Desktop (QGIS, ArcGIS…)
•Cloud GIS
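WMS is one of the interfaces named on this slide, and a scripting client typically talks to it by building a GetMap URL. The following is a minimal sketch using only standard OGC WMS 1.3.0 parameters. The base URL, the layer name `TRUE_COLOR`, and the `getmap_url` helper are placeholders introduced for illustration, not the actual Sentinel Hub configuration.

```python
from urllib.parse import urlencode

# Placeholder endpoint; substitute the URL of a real WMS service.
BASE_URL = "https://example.com/ogc/wms/INSTANCE_ID"

def getmap_url(base_url, layer, bbox, width, height,
               crs="EPSG:3857", image_format="image/png", time=None):
    """Build a WMS 1.3.0 GetMap URL for one layer and one bounding box."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    if time is not None:
        # TIME is the optional WMS dimension used to select acquisition dates.
        params["TIME"] = time
    return base_url + "?" + urlencode(params)

# Hypothetical layer name and bounding box in Web Mercator metres.
url = getmap_url(BASE_URL, "TRUE_COLOR", (0, 0, 10000, 10000), 512, 512,
                 time="2018-05-01/2018-05-31")
```

Because projection, size, format and time are all request parameters, the server has to resolve them at request time, which is the on-demand model the talk argues for.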
15. Supported Data sources
• Currently available
•Sentinel-1 GRD (global archive since 1/5/17)
•Sentinel-2 (full global archive)
•Sentinel-3 (full global archive)
•Landsat-8 USGS (global archive)
•Landsat-5, 7, 8 (ESA Archive)
•Envisat MERIS (full global archive)
•MODIS Terra and Aqua
•DEM – SRTM30
•Planet and RapidEye (limited due to business models)
• Up to date!
17. System stats
• 50 Million requests per month (May 2018, growing 10-20% per month)
• 0.53 billion data access requests (approx. 50 PB of data)
• 5.5 TB data transfer out
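A few ratios help put these figures in proportion. The sketch below takes the slide's numbers at face value and assumes they all cover the same period, which the slide does not state; decimal units (1 TB = 10^12 bytes) are also an assumption.

```python
import math

requests_per_month = 50e6   # user-facing requests
data_accesses = 0.53e9      # accesses to stored source data
data_touched_bytes = 50e15  # approx. 50 PB
data_out_bytes = 5.5e12     # 5.5 TB transferred out

# Each user request fans out into several reads of source data.
accesses_per_request = data_accesses / requests_per_month

# Average response size delivered to the user.
out_bytes_per_request = data_out_bytes / requests_per_month

# How much more data is touched internally than is sent out.
reduction_factor = data_touched_bytes / data_out_bytes

def doubling_time_months(monthly_growth):
    """Months needed to double at a compounding monthly growth rate."""
    return math.log(2) / math.log(1 + monthly_growth)

slow = doubling_time_months(0.10)
fast = doubling_time_months(0.20)
```

Under these assumptions a request triggers roughly 10.6 data accesses and returns about 110 KB, the system touches several thousand times more data than it ships out, and 10-20% monthly growth means the load doubles roughly every 4 to 7 months.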
18. System design
(Architecture diagram; component labels as extracted)
• User
• AWS Elastic Load Balancer
• Layer7 Load Balancers (2x m3.medium)
• OGC
• LB
• Data processor: c5.2xlarge (8x on demand, up to 10 spot)
• Catalogue, Configuration (2x m5.large)
• DB
• Index (10TB)
• S3 Data (3PB)
• Further instance labels: 1x m4.large, 2x i3.xlarge
20. System design
(Architecture diagram; component labels as extracted)
• User
• AWS Elastic Load Balancer
• Layer7 Load Balancers (2x m3.medium)
• OGC
• LB
• Data processor: c5.2xlarge (8x on demand, up to 10 spot)
• Stat renderer
• AWS Lambda
• Catalogue & Configuration (2x m5.large)
• DB
• Index (10TB)
• S3 Data (3PB)
• Further instance labels: 1x m4.large, 2x i3.xlarge
22. Lessons learned - infrastructure
• As little storage as needed
• Build system close to the data (performance + costs)
• Requester pays is still an OK compromise
• Spot instances are cheap (3.5x cheaper than on-demand) yet volatile
• S3 is really fast (bucket sharing makes it faster)
• Many workers have better throughput to S3
• Lambda is not good for everything
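The spot-versus-on-demand trade-off can be put into numbers with a small sketch. It combines the 3.5x price ratio from this slide with the fleet size from the system design slide (8 on-demand plus up to 10 spot instances). Prices are expressed relative to one on-demand instance-hour, and the `fleet_cost` helper is hypothetical.

```python
def fleet_cost(on_demand_count, spot_count, on_demand_price=1.0, spot_discount=3.5):
    """Hourly cost of a mixed fleet, in units of one on-demand instance-hour."""
    spot_price = on_demand_price / spot_discount
    return on_demand_count * on_demand_price + spot_count * spot_price

# Peak capacity of 18 workers: mixed fleet versus all on-demand.
peak_mixed = fleet_cost(8, 10)
peak_all_on_demand = fleet_cost(18, 0)
saving = 1 - peak_mixed / peak_all_on_demand
```

At peak, the mixed fleet costs about 10.9 on-demand units instead of 18, a saving of roughly 40%, while the 8 on-demand workers keep a baseline running if spot capacity is reclaimed.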
23. (Diagram: data sources, interfaces and clients)
• Data sources
•Open EO data - Sentinel-1, Sentinel-2, Landsat, etc.
•Commercial EO data – WorldView, GeoEye, ...
•Aerial imagery (drone, plane)
•Other raster data
• Interfaces
•WMS, WMTS, WCS, API
• Clients
•Machine learning
•Scripting (R, Python, ENVI…)
•Web / Mobile apps
•Desktop (QGIS, ArcGIS…)
•Cloud GIS
37. Lessons learned and remaining challenges
• Public datasets are really cool
• With commercial providers, the minimum order is often the issue, not the price per sq. km (pay per use, revenue sharing)
• Data providers are not as consistent as one would like
• Redesign of existing algorithms made for whole tiles
• Setting up the system is just the beginning of the work
38. Summary
• For public institutions creating remote sensing data:
•Put them in the cloud
•Make them openly and directly available
•Use COG or similar
•Data portals vs. Open dataset
• For commercial entities creating remote sensing data:
•Put them in the cloud
•Make them directly available under reasonable business model
•Use COG or similar
39. More info
• http://sentinel-hub.com/
• http://apps.sentinel-hub.com/eo-browser/
• http://apps.sentinel-hub.com/sentinel-playground/
• https://sentinel-hub.github.io/custom-scripts/
• https://github.com/sentinel-hub
• https://github.com/sentinel-hub/eo-learn/
Thanks