GeoServer on Steroids at FOSS4G Europe 2014

Setting up a GeoServer can sometimes be deceptively simple. However, going from proof of concept to production requires a number of steps to optimize the server for availability, performance and scalability.
The presentation shows how to get from a basic setup to a battle-ready, rock-solid installation by showing the ropes an advanced user has already mastered.

  1. 1. GeoServer on steroids: All you wanted to know about how to make GeoServer faster but you never asked (or you did and no one answered). Ing. Andrea Aime, GeoSolutions. Ing. Simone Giannecchini, GeoSolutions. FOSS4G Europe 2014, Bremen, 14th-17th July 2014
  2. 2. GeoSolutions ▪ Founded in Italy in late 2006 ▪ Expertise • Image Processing, GeoSpatial Data Fusion • Java, Java Enterprise, C++, Python • JPEG2000, JPIP, Advanced 2D visualization ▪ Supporting/Developing FOSS4G ▪ MapStore, GeoServer ▪ GeoNetwork, GeoBatch ▪ Clients ▪ Public Agencies ▪ Private Companies ▪ http://www.geo-solutions.it
  3. 3. Summary ▪ Contents ▪ Preparing raster data and preparing vector data ▪ Optimizing styling ▪ Output tuning ▪ Tiling and caching ▪ Resource control ▪ Deploy configurations ▪ Two slide decks in a row ▪ A “short” one, fitting the presentation time ▪ A detailed one following, 90+ slides worth of details
  4. 4. Preparing raster inputs
  5. 5. Raster Data Checklist ▪ Objectives ▪ Fast extraction of a subset of the data ▪ Fast extraction of the desired resolution ▪ Check-list ▪ Avoid having to open a large number of files per request ▪ Avoid parsing of complex structures and slow compressions ▪ Get to know your bottlenecks ▪ CPU vs Disk Access Time vs Memory ▪ Experiment with ▪ Format, compression, different color models, tile size, overviews, GeoServer configuration
  6. 6. Peculiar Formats ▪ PNG/JPEG direct serving ▪ Bad formats (especially in Java) ▪ No tiling (or rarely supported) ▪ Chew a lot of memory and CPU for decompression ▪ Mitigate with external overviews ▪ Any input ASCII format (GML grid, ASCII grid) ▪ JPEG2000 ▪ Extensible and rich, not (always) fast, can be difficult to tune for performance (might require specific encoding options) ▪ MrSID (can work, needs tuning) ▪ ECW (licensing issues)
  7. 7. GeoTiff for the win ▪ To remember: GeoTiff is a Swiss Army knife ▪ But you don’t want to cut a tree with it! ▪ Tremendously flexible, a good fit for most (not all) use cases ▪ BigTiff pushes the GeoTiff limits farther ▪ Use GeoTiff when ▪ Overviews and Tiling stay within 4GB ▪ No additional dimensions ▪ Consider BigTiff for very large files (> 4 GB) ▪ Support for tiling ▪ Support for Overviews ▪ Can be inefficient with very large files + small tiling
  8. 8. GeoTiff preparation ▪ (Optional) Use gdalwarp to transform the data into the most used output reference system (mind, any reprojection degrades the data a bit) ▪ Use gdal_translate to add inner tiling and fix any issues with the coordinate reference system ▪ Add compression options if your disks are small/slow/not local (consider JPEG compression with YCBCR color interpretation for photos, LZW/Deflate for scientific data) ▪ Use gdaladdo to add internal overviews (remember to replicate compression here)
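The preparation steps on this slide can be sketched as a small Python helper that only assembles the command lines; it does not run GDAL. File names are placeholders, and the 512-pixel block size and the overview levels are assumptions to adapt to your data. The creation options (TILED, BLOCKXSIZE, BLOCKYSIZE, COMPRESS, PHOTOMETRIC) and the COMPRESS_OVERVIEW / PHOTOMETRIC_OVERVIEW configuration keys are regular GDAL GeoTIFF options.

```python
from typing import List, Sequence


def translate_cmd(src: str, dst: str, photo: bool = True, block: int = 512) -> List[str]:
    """Build a gdal_translate call that adds inner tiling and compression."""
    cmd = [
        "gdal_translate",
        "-co", "TILED=YES",
        "-co", f"BLOCKXSIZE={block}",
        "-co", f"BLOCKYSIZE={block}",
    ]
    if photo:
        # JPEG with YCbCr color interpretation, suggested for RGB photos
        cmd += ["-co", "COMPRESS=JPEG", "-co", "PHOTOMETRIC=YCBCR"]
    else:
        # Lossless compression, suggested for scientific data
        cmd += ["-co", "COMPRESS=DEFLATE"]
    return cmd + [src, dst]


def overview_cmd(dst: str, photo: bool = True,
                 levels: Sequence[int] = (2, 4, 8, 16)) -> List[str]:
    """Build a gdaladdo call that replicates the compression on the overviews."""
    cmd = ["gdaladdo", "-r", "average"]
    if photo:
        cmd += ["--config", "COMPRESS_OVERVIEW", "JPEG",
                "--config", "PHOTOMETRIC_OVERVIEW", "YCBCR"]
    else:
        cmd += ["--config", "COMPRESS_OVERVIEW", "DEFLATE"]
    return cmd + [dst] + [str(level) for level in levels]


if __name__ == "__main__":
    # Print the two commands for a hypothetical in.tif / out.tif pair
    for cmd in (translate_cmd("in.tif", "out.tif"), overview_cmd("out.tif")):
        print(" ".join(cmd))
```

The lists can be passed to `subprocess.run` once GDAL is installed; building them separately makes it easy to review the options before touching large files.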
  9. 9. Possible structures ▪ Single GeoTiff with internal tiling and overviews (GeoTiff < 2GB, BigTiff < 20-50GB) ▪ Mosaic of GeoTiff, each one with internal tiling and overviews (< 500GB, not too many files) ▪ Image pyramid (> 500GB): each level has lower resolution than the previous, each level has tiles with inner tiling but no overviews (also possible to mix: fewer levels, plus internal overviews)
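As a rough illustration, the size thresholds on this slide can be turned into a lookup. This is only a sketch: the slide gives a 20-50GB range for BigTiff, and the 50GB cut-off used here is an assumption, as is the function name.

```python
def suggest_layout(size_gb: float) -> str:
    """Map a dataset size in GB to the layout suggested by the slide's thresholds."""
    if size_gb < 2:
        return "single GeoTiff"
    if size_gb < 50:  # assumption: upper end of the slide's 20-50GB range
        return "single BigTiff"
    if size_gb < 500:
        return "ImageMosaic"
    return "ImagePyramid"


if __name__ == "__main__":
    for size in (1, 40, 400, 4000):
        print(size, "GB:", suggest_layout(size))
```

Size is only one input; the number of files and the need for extra dimensions (time, elevation) can still push a smaller dataset towards a mosaic.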
  10. 10. Choosing Formats and Layouts ▪ Use ImageMosaic when: ▪ A single file gets too big (inefficient seeks, too much metadata to read, etc.) ▪ Multiple Dimensions (time, elevation, others) ▪ Avoid mosaics made of many very small files ▪ Single granules can be large ▪ Use Tiling + Overviews + Compression on granules ▪ Use ImagePyramid when: ▪ Tremendously large dataset ▪ Too many files / too large files ▪ Need to serve at all scales ▪ Especially low resolution ▪ For single granules (< 2 GB) GeoTiff is generally a good fit
  11. 11. Proper Mosaic Preparation ▪ Optimize files as if you were serving them individually ▪ Keep a balance between number and dimensions of granules ▪ If memory is scarce: ▪ USE_JAI_IMAGEREAD to true ▪ USE_MULTITHREADING to false* ▪ Otherwise ▪ USE_JAI_IMAGEREAD to false ▪ ALLOW_MULTITHREADING to true (Load data from different granules in parallel)
  12. 12. Multidimensional mosaics ▪ Use Cases: ▪ MetOc data (support for time, elevation) ▪ Data with additional independent dimensions ▪ Suggestions ▪ Use ImageMosaic ▪ Use a DBMS for indexing granules ▪ Use file-name-based property collectors to turn properties into DB row attributes ▪ Filter by time, elevation and other attributes via OGC and CQL filters ▪ Check detailed deck for details!
  13. 13. Multidimensional mosaics
  14. 14. Proper Pyramid Preparation ▪ Use gdal_retile for creating the pyramid ▪ Prepare the list of tiles to be retiled ▪ Create the pyramid with GDAL retile (grab a coffee!) ▪ Chunks should not be too small (here 2048x2048); if you go larger, also use inner tiling ▪ If the input dataset is huge use the useDirForEachRow option ▪ Too many files in a dir is bad practice ▪ Make sure the number of levels is consistent with usage ▪ Too few → bad performance at high scale ▪ GeoServer config → same as mosaic
  15. 15. Proper GeoServer Coverage Options Configuration ▪ Make sure native JAI is installed ▪ Install the TurboJPEG extension ▪ Enable JAI Mosaicking native acceleration ▪ Give JAI enough memory ▪ Don’t raise JAI memory Threshold too high ▪ Rule of thumb: use 2 X #Core Tile Threads (check next slide) ▪ Play with tile Recycling against your workflows (might help, might not)
  16. 16. Proper GeoServer Coverage Options Configuration ▪ Multithreaded Granule Loading ▪ Allows fine tuning of multithreading for ImageMosaic ▪ Orthogonal to JAI Tile Threads ▪ Rule of Thumb: use 2 X #Core Tile Threads ▪ Perform testing to fine tune depending on layer configuration as well as on typical requests ▪ ImageIO Cache threshold ▪ Decides when we switch to disk cache (very large WCS requests)
  17. 17. Preparing vector inputs
  18. 18. Vector data checklist ▪ What do we want from vector data: ▪ Binary data ▪ No complex parsing of data structures ▪ Fast extraction of a geographic subset ▪ Fast filtering on the most commonly used attributes
  19. 19. Choosing a format ▪ Slow formats ▪ WFS ▪ GML ▪ DXF ▪ Good formats, local and indexable ▪ Shapefile ▪ Directory of shapefiles ▪ SDE ▪ Spatial databases: PostGIS, Oracle Spatial, DB2, MySQL*, SQL server*
  20. 20. DBMS Checklist ▪ Rich support for complex native filters ▪ Use connection pooling ▪ Validate connections (with proper pooling) ▪ Table Clustering ▪ Spatial Indexing ▪ Spatial Indexing ▪ Spatial Indexing ▪ Alphanumeric Indexing ▪ Alphanumeric Indexing ▪ Alphanumeric Indexing ▪ Did we mention indexes?
  21. 21. Shapefile preparation ▪ Remove .qix file if present ▪ If there are large DBF attributes that are not in use, get rid of them using ogr2ogr, e.g.: ogr2ogr -select FULLNAME,MTFCC arealm.shp tl_2010_08013_arealm.shp ▪ If on Linux, enable memory mapping, faster, more scalable (but will kill Windows):
  22. 22. Shapefile filtering ▪ Stuck with shapefiles and have scale dependent rules like the following? ▪ Show highways first ▪ Show all streets when zoomed in ▪ Use ogr2ogr to build two shapefiles, one with just the highways, one with everything, and build two layers, e.g.: ogr2ogr -sql "SELECT * FROM tl_2010_08013_roads WHERE MTFCC in ('S1100', 'S1200')" primaryRoads.shp tl_2010_08013_roads.shp ▪ Or hire us to develop non-spatial indexing for shapefile!
  23. 23. PostGIS specific hints ▪ PostgreSQL out of the box is configured for very small hardware: http://wiki.postgresql.org/wiki/Performance_Optimization ▪ Make sure to run ANALYZE after data imports (updates optimizer stats) ▪ As usual, avoid large joins in SQL views, consider materialized views ▪ If the dataset is massive, CLUSTER on the spatial index: ▪ http://postgis.refractions.net/documentation/manual-1.3/ch05.html
  24. 24. PostGIS specific hints ▪ Careful with prepared statements (bad performance) ▪ USE CASE: the layer’s style allows displaying the whole layer in a single shot (no scale dependencies) → prepared statements will slow down execution ▪ EXPLANATION: with prepared statements PostGIS will choose to use the spatial index in all cases, which makes retrieving the full data set 2-4 times slower than when using a sequential scan ▪ COUNTERMEASURE: Not using prepared statements allows PostGIS to figure out a suitable plan based on the request bbox instead (assuming someone ran "vacuum analyze" on the database to update the index statistics, and of course, provided there is a spatial index to start with)
  25. 25. Connection Pooling Tricks ▪ Connection pool size should be proportional to the number of concurrent requests you want to serve (obvious, no?) ▪ Activate connection validation ▪ Mind networking tools that might cut connections sitting idle (yes, your server is not always busy), they might cut the connection in “bad” ways (10 minutes timeout before the pool realizes the TCP connection attempt gives up) ▪ Read more at http://geoserver.geo-solutions.it/edu/en/adv_gsconfig/db_pooling.html
  26. 26. Optimize styling
  27. 27. Use scale dependencies ▪ Never show too much data → the map should be readable, not a graphic blob. Rule of thumb: 1000 features max in the display, have labels show up when zoomed in ▪ Show details as you zoom in ▪ Eagerly add MinScaleDenominator to your SLD rules ▪ Add more expensive rendering when there are fewer features ▪ Key to get both a good looking and fast map
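To reason about which rules fire at a given zoom, it helps to compute the scale denominator of a request. The sketch below assumes a CRS in metres and the standardized rendering pixel size of 0.28 mm used by the OGC specifications; the function names and the sample threshold are hypothetical. A rule is taken to apply when the scale denominator is at or above its minimum and strictly below its maximum.

```python
from typing import Optional

OGC_PIXEL_SIZE_M = 0.00028  # standardized rendering pixel size: 0.28 mm


def scale_denominator(bbox_width_m: float, image_width_px: int) -> float:
    """Scale denominator of a map request, assuming a CRS expressed in metres."""
    metres_per_pixel = bbox_width_m / image_width_px
    return metres_per_pixel / OGC_PIXEL_SIZE_M


def rule_applies(scale: float,
                 min_scale: Optional[float] = None,
                 max_scale: Optional[float] = None) -> bool:
    """True when min_scale <= scale < max_scale (a missing bound is ignored)."""
    if min_scale is not None and scale < min_scale:
        return False
    if max_scale is not None and scale >= max_scale:
        return False
    return True


if __name__ == "__main__":
    # A 10 km wide request rendered on a 1000 px wide image
    scale = scale_denominator(10_000, 1000)
    print(round(scale))
    # Hypothetical rule that shows minor streets only below 1:50000
    print(rule_applies(scale, max_scale=50_000))
```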
  28. 28. Example
  29. 29. Labeling ▪ Labeling conflict resolution is expensive, limit it to the innermost zoom levels ▪ Halo is important for readability, but adds significant overhead ▪ Careful with maxDisplacement, it makes for multiple label location attempts
  30. 30. FeatureTypeStyle ▪ GeoServer uses SLD FeatureTypeStyle objects as Z layers for painting ▪ Each one allocates its own rendering surface (which can use a lot of memory), use as few as possible
  31. 31. Use translucency sparingly ▪ Translucent display is expensive, use it sparingly ▪ e.g. translucent fill <CssParameter name="fill-opacity">0.5</CssParameter>
  32. 32. Tiling & Caching
  33. 33. Tile caching with GeoWebCache ▪ Tile oriented maps, fixed zoom levels and fixed grid ▪ Useful for stable layers, backgrounds ▪ Protocols: WMTS, TMS, WMS-C, Google Maps/Earth, VE ▪ Speedup compared to dynamic WMS: 10 to 100 times, assuming tiles are already cached (whole layer pre-seeded) ▪ Suitable for: ▪ Mostly static layer ▪ No/few dynamic parameters (CQL filters, SLD params, SQL query params, time/elevation, format options)
  34. 34. Embedded GWC advantage ▪ No double encoding when using meta-tiling, faster seeding
  35. 35. Space considerations ▪ Seeding Colorado, assuming 8 cores, one layer, 0.1 sec per 756x756 metatile, 15KB for each tile ▪ Do yours: http://tinyurl.com/3apkpss ▪ Not enough disk space? Set a disk quota
      Zoom level   Tile count    Size (GB)   Time to seed (hours)   Time to seed (days)
      13           58,377        1           0                      0
      14           232,870       4           0                      0
      15           929,475       14          0                      0
      16           3,713,893     57          1                      0
      17           14,855,572    227         6                      0
      18           59,396,070    906         23                     1
      19           237,584,280   3,625       92                     4
      20           950,273,037   14,500      367                    15
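The seeding-time column can be approximated with a few lines of arithmetic. This is a sketch under the slide's stated inputs (8 cores, 0.1 s per metatile, 15KB per tile) plus one assumption that is not on the slide: a metatile covers 3x3 = 9 tiles. With that assumption the hours for zoom levels 18 and 20 come out as in the table.

```python
def seed_hours(tile_count: int,
               tiles_per_metatile: int = 9,       # assumption: 3x3 metatiles
               seconds_per_metatile: float = 0.1,
               cores: int = 8) -> float:
    """Estimated wall-clock hours to seed one zoom level."""
    metatiles = tile_count / tiles_per_metatile
    return metatiles * seconds_per_metatile / cores / 3600


def cache_size_gb(tile_count: int, kb_per_tile: float = 15) -> float:
    """Estimated disk usage in decimal GB for one zoom level."""
    return tile_count * kb_per_tile / 1_000_000


if __name__ == "__main__":
    for zoom, tiles in ((18, 59_396_070), (20, 950_273_037)):
        print(zoom, round(seed_hours(tiles)), "hours,",
              round(cache_size_gb(tiles)), "GB")
```

Each zoom level has roughly four times the tiles of the previous one, so the last two or three levels dominate both disk space and seeding time.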
  36. 36. More Tweaks ▪ Client-side caching of tiles ▪ Does not work with browsers in private mode <expireClientsList> <expirationRule minZoom="0" expiration="7200" /> <expirationRule minZoom="10" expiration="600" /> </expireClientsList>
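The expiration rules on this slide can be read as a lookup from zoom level to client cache lifetime in seconds. The sketch below parses the same XML fragment and assumes that each rule applies from its minZoom up to the next rule's minZoom; that reading, and the function names, are assumptions rather than a statement of how GeoWebCache evaluates the list internally.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

RULES_XML = """<expireClientsList>
  <expirationRule minZoom="0" expiration="7200" />
  <expirationRule minZoom="10" expiration="600" />
</expireClientsList>"""


def load_rules(xml_text: str) -> List[Tuple[int, int]]:
    """Return (minZoom, expiration seconds) pairs sorted by minZoom."""
    root = ET.fromstring(xml_text)
    rules = [(int(rule.get("minZoom")), int(rule.get("expiration")))
             for rule in root.findall("expirationRule")]
    return sorted(rules)


def client_expiration(zoom: int, rules: List[Tuple[int, int]]) -> Optional[int]:
    """Pick the rule with the highest minZoom that does not exceed the zoom level."""
    selected = None
    for min_zoom, seconds in rules:
        if zoom >= min_zoom:
            selected = seconds
    return selected


if __name__ == "__main__":
    rules = load_rules(RULES_XML)
    print(client_expiration(5, rules), client_expiration(12, rules))
```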
  37. 37. More Tweaks ▪ Use the right formats ▪ JPEG for background data (e.g. orthophotos) ▪ PNG8 + precomputed palette for background data (e.g. orthophotos) ▪ PNG8 full for overlays with transparency ▪ The format also impacts the disk space needed! (as well as the generation time) ▪ Check this blog post
  38. 38. Resource control
  39. 39. Resource Limits ▪ Limit the amount of resources dedicated to an individual request ▪ Improve fairness between requests, by preventing individual requests from hijacking the server and/or running for a very long time ▪ EXTREMELY IMPORTANT in production environment ▪ WHEN TO TWEAK THEM? ▪ Frequent OOM Errors despite plenty of RAM ▪ Requests that keep running for a long time (e.g. CPU usage peaks even if no requests are being sent) ▪ DB Connection being killed by the DBMS while in use (ok, you might also need to talk to the DBA..)
  40. 40. WMS request limits ▪ Max memory per request: avoids large requests, allows sizing the server memory (max concurrent requests * max memory) ▪ Max time per request: avoids requests taking too much time (e.g., using a custom style provided with dynamic SLD in the request) ▪ Max errors: best effort renderer, but handling errors takes time
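The sizing rule on this slide (max concurrent requests * max memory) is simple enough to write down. The sketch assumes the per-request limit is expressed in KB; the sample numbers (16 parallel requests, 65536 KB each, a 2048 MB heap) are hypothetical.

```python
def wms_memory_budget_mb(max_concurrent_requests: int,
                         max_memory_per_request_kb: int) -> float:
    """Worst-case rendering memory, in MB, if every slot uses its full limit."""
    return max_concurrent_requests * max_memory_per_request_kb / 1024


def fits_in_heap(budget_mb: float, heap_mb: float) -> bool:
    """Check the worst case against the JVM heap available for rendering."""
    return budget_mb <= heap_mb


if __name__ == "__main__":
    budget = wms_memory_budget_mb(16, 65536)
    print(budget, fits_in_heap(budget, 2048))
```

The heap also has to hold everything else GeoServer does, so in practice the budget should sit well below the configured heap size.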
  41. 41. WFS request limits ▪ Max features returned, configured as a global limit ▪ Return feature bbox: reduce amount of generated GML ▪ Per layer max feature count
  42. 42. WCS request limits
  43. 43. Control flow ▪ Control how many requests are executed in parallel, queue others: ▪ Increase throughput ▪ Control memory usage ▪ Enforce fairness ▪ More info here
  44. 44. Control flow ▪ $GEOSERVER_DATA_DIR/controlflow.properties: # don't allow more than 16 GetMap requests in parallel ows.wms.getmap=16 ▪ 17%
  45. 45. Resource Limits ▪ They need to be tweaked together with Control-Flow ▪ Limiting individual requests → Resource Limits ▪ Limiting the number of parallel requests → Control Flow ▪ When time is involved make sure you take into account all pieces of the response chain ▪ E.g. limited rendering time for WMS on data coming from WMS, items to take into account are ▪ LifeTime of DB Connection (usually long) ▪ WaitTime for a new connection (we don’t want to queue requests at the connection pool when they are already eating memory!)
  46. 46. JVM and deploy configuration
  47. 47. Premise ▪ The options discussed here are not going to help visibly if you did not prepare the data and the styles ▪ They are finishing touches that can get performance up once the major data bottlenecks have been dealt with ▪ Check “Running in production” instructions here
  48. 48. JVM settings ▪ -server: enables the server JIT compiler ▪ -Xms2048m -Xmx2048m: sets the JVM to use two gigabytes of memory ▪ -XX:+UseParallelOldGC -XX:+UseParallelGC: enables multi-threaded garbage collections, useful if you have more than two cores ▪ -XX:NewRatio=2: informs the JVM there will be a high number of short lived objects ▪ -XX:+AggressiveOpts: enables experimental optimizations that will be defaults in future versions of the JVM
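For reference, the options from this slide can be assembled into a single JAVA_OPTS-style string, using the spellings HotSpot accepts. This is a sketch for JVMs of the period of the talk; some of these flags may be deprecated or rejected by newer JVMs, so check your runtime before copying them. The helper name and its parameters are hypothetical.

```python
def java_opts(heap_mb: int = 2048, new_ratio: int = 2) -> str:
    """Join the slide's JVM options into one command-line string."""
    options = [
        "-server",                    # server JIT compiler
        f"-Xms{heap_mb}m",            # initial heap
        f"-Xmx{heap_mb}m",            # maximum heap, same value to avoid resizing
        "-XX:+UseParallelOldGC",      # multi-threaded old generation collection
        "-XX:+UseParallelGC",         # multi-threaded young generation collection
        f"-XX:NewRatio={new_ratio}",  # many short-lived objects expected
        "-XX:+AggressiveOpts",        # experimental optimizations
    ]
    return " ".join(options)


if __name__ == "__main__":
    print(java_opts())
```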
  49. 49. Set up a local cluster ▪ Oracle Java2D locks when drawing antialiased vectors ▪ Limits scalability severely ▪ Two options ▪ Use OpenJDK, it’s slower at rendering but scales up well ▪ Use Apache mod_proxy_balancer and set up one GeoServer for every 2-4 cores (diagram: mod_proxy_balancer in front of three GeoServer instances)
  50. 50. Clustering advantage 66% ▪ FOSS4G 2010 vector benchmarks (roads/buildings/isolines and so on, over the entire Spain) ▪ GeoServer 2.2.x was benchmarked using Oracle JDK without local clustering
  51. 51. Marlin renderer ▪ The OpenJDK Java2D renderer scales up, but it’s not super-fast when the load is small (1 request at a time) ▪ Marlin-renderer to the rescue: https://github.com/bourgesl/marlin-renderer ▪ Complex map, 10 parallel requests, different zoom levels have different details showing up
  52. 52. Upgrade! ▪ Performance tends to go up version by version ▪ Please do use a recent GeoServer version ▪ FOSS4G 2010 vector benchmark with different versions of GeoServer
  53. 53. The End. Questions? andrea.aime@geo-solutions.it simone.giannecchini@geo-solutions.it
  54. 54. GeoServer on steroids, the detailed version All you wanted to know about how to make GeoServer faster but you never asked (or you did and no one answered) Ing. Andrea Aime, GeoSolutions Ing. Simone Giannecchini, GeoSolutions FOSS4G Europe 2014, Bremen 14th-17th July 2014
  55. 55. Preparing raster inputs
  56. 56. Raster Data Checklist ▪ Objectives ▪ Fast extraction of a subset of the data ▪ Fast extraction of overviews ▪ Check-list ▪ Avoid having to open a large number of files per request ▪ Avoid parsing of complex structures ▪ Avoid on-the-fly reprojection (if possible) ▪ Get to know your bottlenecks ▪ CPU vs Disk Access Time vs Memory ▪ Experiment with ▪ Format, compression, different color models, tile size, overviews, configuration (in GeoServer of course)
  57. 57. Problematic Formats ▪ PNG/JPEG direct serving ▪ Bad formats (especially in Java) ▪ No tiling (or rarely supported) ▪ Chew a lot of memory and CPU for decompression ▪ Mitigate with external overviews ▪ NetCDF/grib1 for large images (large width and height) ▪ Complex formats (often with many subdatasets) ▪ Often contain un-calibrated data ▪ Must usually use multiple dimensions ▪ Use ImageMosaic ▪ Must usually massage the data before serving ▪ e.g. transpose X,Y
  58. 58. Problematic Formats ▪ Ascii Grid, GTOPO30, IDRISI and similar formats are bad ▪ ASCII formats are bad ▪ No internal tiling, no compression, no internal overviews ▪ JPEG2000 (with Kakadu) ▪ Extensible and rich, not (always) fast ▪ Can be difficult to tune for performance (might require specific encoding options) ▪ ECW and MrSID ▪ Very fast on some types of data ▪ Need to be tuned to perform well
  59. 59. Choosing Formats and Layouts ▪ To remember: GeoTiff is a Swiss Army knife ▪ But you don’t want to cut a tree with it! ▪ Tremendously flexible, a good fit for most (not all) use cases ▪ BigTiff pushes the GeoTiff limits farther ▪ Single File VS Mosaic VS Pyramids ▪ Use single GeoTiff when ▪ Overviews and Tiling stay within 4GB ▪ No additional dimensions ▪ Consider BigTiff for very large files (> 4 GB) ▪ Support for tiling ▪ Support for Overviews ▪ Can be inefficient with very large files + small tiling
  60. 60. Choosing Formats and Layouts ▪ Use ImageMosaic when: ▪ A single file gets too big (inefficient seeks, too much metadata to read, etc.) ▪ Multiple Dimensions (time, elevation, others) ▪ Avoid mosaics made of many very small files ▪ Single granules can be large ▪ Use Tiling + Overviews + Compression on granules ▪ Use ImagePyramid when: ▪ Tremendously large dataset ▪ Too many files / too large files ▪ Need to serve at all scales ▪ Especially low resolution ▪ For single granules (< 2 GB) GeoTiff is generally a good fit
  61. 61. Choosing Formats and Layouts ▪ Use ImageMosaic when: ▪ A single file gets too big (inefficient seeks, too much metadata to read, etc.) ▪ Multiple Dimensions (time, elevation, others) ▪ Avoid mosaics made of many very small files ▪ Single granules can be large ▪ Use Tiling + Overviews + Compression on granules ▪ Use ImagePyramid when: ▪ Tremendously large dataset ▪ Too many files / too large files ▪ Need to serve at all scales ▪ Especially low resolution ▪ For single granules (< 2 GB) GeoTiff is generally a good fit
  62. 62. Choosing Formats and Layouts ▪ Examples: ▪ Small dataset: single 2GB GeoTiff file ▪ Medium dataset: single 40GB BigTiff ▪ Large dataset: 400GB mosaic made of 10GB BigTiff files ▪ Extra large: 4TB of imagery, built as pyramid of mosaics of BigTiff/GeoTiff files to keep the file count low
  63. 63. GeoTiff preparation ▪ STEP 0: get to know your data ▪ gdalinfo is your friend ▪ CheckList ▪ Missing CRS ▪ Add a .prj file ▪ Fix with gdal_translate ▪ Missing georeferencing ▪ Add a World File ▪ Fix with gdal_translate ▪ Bad Tiling ▪ Fix with gdal_translate ▪ Missing Overviews ▪ Use gdaladdo ▪ Compression ▪ Use gdal_translate
  64. 64. GeoTiff preparation ▪ STEP 1: fix and optimize with gdal_translate ▪ CRS and GeoReferencing ▪ gdal_translate -a_srs "EPSG:4326" -a_ullr -180 0 -90 90 in.tif out.tif ▪ Inner Tiling ▪ gdal_translate -co "TILED=YES" -co "BLOCKXSIZE=512" -co "BLOCKYSIZE=512" in.tif out.tif ▪ Check also GeoTiff driver creation options here ▪ STEP 2: add overviews with gdaladdo ▪ Leverages TIFF support for multipage files and reduced resolution pages ▪ gdaladdo -r cubic output.tif 2 4 8 16 32 64 128 ▪ Choose the resampling algorithm wisely ▪ Choose the tile size and compression wisely (use GDAL_TIFF_OVR_BLOCKSIZE) ▪ Consider external overviews
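The list of overview factors passed to gdaladdo depends on the image size. One possible way to derive it, shown below, is to keep halving until the coarsest overview is about one tile wide. That stopping rule is an assumption (a common rule of thumb, not something stated on the slide), and the helper name is hypothetical.

```python
from typing import List


def overview_factors(width: int, height: int, block: int = 512) -> List[int]:
    """Powers of two to pass to gdaladdo, down to roughly one block per overview."""
    factors = []
    factor = 2
    # Keep adding levels while the reduced image is still at least one block wide
    while max(width, height) / factor >= block:
        factors.append(factor)
        factor *= 2
    return factors


if __name__ == "__main__":
    # A hypothetical 65536 x 65536 pixel image with 512 pixel blocks
    levels = overview_factors(65536, 65536)
    print("gdaladdo -r cubic output.tif " + " ".join(str(f) for f in levels))
```

For the hypothetical 65536-pixel image this yields the same 2 to 128 sequence used in the slide's gdaladdo example; an image smaller than one block gets no overviews at all.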
  65. 65. GeoTiff preparation STEP 1: fix and optimize with gdal_translate • CRS and GeoReferencing
  66. 66. GeoTiff preparation STEP 1: fix and optimize with gdal_translate • Inner Tiling
  67. 67. GeoTiff preparation STEP 2: add overviews with gdaladdo
  68. 68. GeoTiff preparation ▪ Compression ▪ Consider when disk speed/space is an issue ▪ Control it with gdal_translate and creation options ▪ GeoTiff tiles can be compressed ▪ LZW/Deflate are good for lossless compression ▪ JPEG is good for visually lossless compression ▪ From experience ▪ Use LZW/Deflate on geophysical data (DEM, acquisitions) ▪ Use JPEG visually lossless with Photometric Interpretation set to YCbCr for RGB
  69. 69. GeoTiff preparation Compression:
  70. 70. GeoTiff preparation Test on a GeoTIFF image with (and without) the following features: • Tiling • Overview • Compression Results: • Overviews increase performance by more than 6 times compared with an image without them. • An uncompressed image increases GeoServer performance by 70%. NOTE: All the tests in this section are performed on a 4 core PC with 16 GB RAM and GeoServer 2.4.
  71. 71. GeoTiff preparation Test on a JPEG-compressed GeoTIFF image with (and without) TurboJPEG acceleration: Results: • TurboJPEG gives a 20% better response in the presence of overviews, and 6% in the other cases.
  72. 72. Time, Elevation and other dimensions ▪ Use Cases: ▪ MetOc data (support for time, elevation) ▪ Data with additional independent dimensions ▪ WorkFlow ▪ Split in multiple GeoTiff files ▪ Optimize the files individually ▪ Use ImageMosaic ▪ Use a DBMS for indexing granules ▪ Use file-name-based property collectors to turn properties into DB row attributes ▪ Filter by time, elevation and other attributes via OGC and CQL filters ▪ Check back up slides for more info!
  73. 73. Time, Elevation and other dimensions ▪ Indexing multiple dimensions with DB support (video here): datastore.properties, stringregex.properties, timeregex.properties, indexer.properties
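The idea behind file-name-based property collectors is that a regular expression pulls a value out of each granule's file name, and that value becomes an attribute in the index. The sketch below imitates this in plain Python; the file naming scheme, both regular expressions and the attribute names are hypothetical and are not taken from the configuration files listed on the slide.

```python
import re
from datetime import datetime
from typing import Any, Dict

# Hypothetical patterns for names such as temp_0100_20140715T120000.tif
TIME_REGEX = r"[0-9]{8}T[0-9]{6}"
ELEVATION_REGEX = r"(?<=_)[0-9]{4}(?=_)"


def collect_properties(file_name: str) -> Dict[str, Any]:
    """Extract index attributes from a granule file name."""
    time_match = re.search(TIME_REGEX, file_name)
    elevation_match = re.search(ELEVATION_REGEX, file_name)
    return {
        "location": file_name,
        "time": (datetime.strptime(time_match.group(0), "%Y%m%dT%H%M%S")
                 if time_match else None),
        "elevation": float(elevation_match.group(0)) if elevation_match else None,
    }


if __name__ == "__main__":
    print(collect_properties("temp_0100_20140715T120000.tif"))
```

Once the attributes are rows in a database table, time and elevation filters become ordinary indexed queries instead of a scan over file names.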
  74. 74. Time, Elevation and other dimensions
  75. 75. Proper Mosaic Preparation ▪ ImageMosaic stitches single granules together with basic processing ▪ Filtered selection ▪ Overviews/Decimation on read ▪ Over/DownSampling in memory ▪ ColorMask (optional) ▪ Mosaic/Stitch ▪ ColorMask again (optional) ▪ Optimize files as if you were serving them individually ▪ Keep a balance between number and dimensions of granules
  76. 76. Proper Mosaic Configuration ▪ STEP 0: Configure Coverage Access (see slide 34) ▪ STEP 1: Configure Mosaic Parameters ▪ ALLOW_MULTITHREADING ▪ Load data from different granules in parallel ▪ Needs USE_JAI_IMAGEREAD set to false (Immediate Mode) ▪ Use a proper Tile Size ▪ In-memory processing, must not be too large ▪ Disk tiling should be larger ▪ If memory is scarce: ▪ USE_JAI_IMAGEREAD to true ▪ USE_MULTITHREADING to false* ▪ Otherwise ▪ USE_JAI_IMAGEREAD to false ▪ ALLOW_MULTITHREADING to true
  77. 77. Advanced Mosaic Configuration ▪ Optional (Advanced): Configure Mosaic Parameters Directly ▪ Caching ▪ Load the index in memory (using JTS STRtree) ▪ Super fast granule lookup, good for shapefiles ▪ Bad if you have additional dimensions to filter on ▪ Based on Soft References, controlled via Java switch SoftRefLRUPolicyMSPerMB ▪ ExpandToRGB ▪ Expand colormapped imagery to RGB in memory ▪ Trade performance for quality ▪ SuggestedSPI ▪ Default ImageIO Decoder class to use ▪ Don’t touch unless expert
  78. 78. Proper Mosaic Configuration ▪ Test on a Mosaic Image ▪ USE_JAI_IMAGEREAD (IR) set to true and ALLOW_MULTITHREADING (MT) set to false. ▪ ALLOW_MULTITHREADING set to true and USE_JAI_IMAGEREAD set to false. Results: • The use of MULTITHREADING gives a 30% better performance.
  79. 79. Proper Pyramid Preparation ▪ Use gdal_retile for creating the pyramid ▪ Prepare the list of tiles to be retiled ▪ Create the pyramid with GDAL retile (grab a coffee!) ▪ Chunks should not be too small (here 2048x2048) ▪ Too many files is bad anyway ▪ Use internal tiling for larger chunk sizes ▪ If the input dataset is huge use the useDirForEachRow option ▪ Too many files in a dir is bad practice ▪ Make sure the number of levels is consistent ▪ Too few → bad performance at high scale
  80. 80. Proper Pyramid Configuration ▪ STEP 0: Configure Coverage Access (see slide 34) ▪ STEP 1: Configure Pyramid Parameters ▪ ImagePyramid relies on ImageMosaic ▪ ALLOW_MULTITHREADING ▪ Load data from different granules in parallel ▪ Needs USE_JAI_IMAGEREAD set to false (Immediate Mode) ▪ Use a proper Tile Size ▪ In-memory processing, must not be too large ▪ Disk tiling should be larger ▪ If memory is scarce: ▪ USE_JAI_IMAGEREAD to true ▪ USE_MULTITHREADING to false* ▪ Otherwise ▪ USE_JAI_IMAGEREAD to false ▪ ALLOW_MULTITHREADING to true
  81. 81. Proper Pyramid Configuration ▪ Optional (Advanced): Configure Mosaic Parameters Directly ▪ Caching ▪ Load the index in memory (using JTS STRtree) ▪ Super fast granule lookup, good for shapefiles ▪ Bad if you have additional dimensions to filter on ▪ Based on Soft References, controlled via Java switch SoftRefLRUPolicyMSPerMB ▪ ExpandToRGB ▪ Expand colormapped imagery to RGB in memory ▪ Trade performance for quality ▪ SuggestedSPI ▪ Default ImageIO Decoder class to use ▪ Don’t touch unless expert
  82. 82. Proper GDAL Formats Configuration ▪ Fix Missing/Improper CRS with PRJ or coverage config ▪ Fix Missing GeoReferencing with World File ▪ Make sure GDAL_DATA is properly configured ▪ Use a proper Tile Size ▪ In-memory processing, must not be too large ▪ Fundamental for striped data! JNI overhead ▪ Disk tiling should be larger ▪ If memory is scarce: ▪ USE_JAI_IMAGEREAD to true ▪ USE_MULTITHREADING to true* ▪ Otherwise ▪ USE_JAI_IMAGEREAD to false ▪ USE_MULTITHREADING is ignored
  83. 83. Proper GDAL Formats Configuration Test on an ECW image with and without enabling ImageRead: • ECW is a GDAL supported format. Results: ▪ If ImageRead is not used, performance increases by more than 1.5 times.
  84. 84. Proper JPEG2000 Kakadu Configuration ▪ Fix Missing/Improper CRS with PRJ or coverage config ▪ Fix Missing GeoReferencing with World File ▪ Make sure Kakadu dll/so is properly loaded ▪ Use a proper Tile Size ▪ In-memory processing ▪ Must not be too large ▪ Disk tiling should be larger ▪ If memory is scarce: ▪ USE_JAI_IMAGEREAD to true ▪ USE_MULTITHREADING to true* ▪ Otherwise ▪ USE_JAI_IMAGEREAD to false ▪ USE_MULTITHREADING is ignored
  85. 85. Proper GeoServer Coverage Options Configuration ▪ Make sure native JAI and ImageIO are installed ▪ Enable ImageIO native acceleration ▪ Enable JAI Mosaicking native acceleration ▪ Give JAI enough memory ▪ Don’t raise JAI memory Threshold too high ▪ Rule of thumb: use 2 X #Core Tile Threads (check next slide) ▪ Enable Tile Recycling only on trunk ▪ Enable Tile Recycling if memory is not a problem
  86. 86. Proper GeoServer Coverage Options Configuration ▪ Multithreaded Granule Loading ▪ Allows fine tuning of multithreading for ImageMosaic ▪ Orthogonal to JAI Tile Threads ▪ Rule of Thumb: use 2 X #Core Tile Threads ▪ Perform testing to fine tune depending on layer configuration as well as on typical requests ▪ ImageIO Cache threshold ▪ Decides when we switch to disk cache (very large WCS requests)
  87. 87. Reprojection Performance Vs Quality ▪ GeoServer (since 2.1.x) reprojects raster data using a piecewise-linear algorithm ▪ The area is divided in rectangular blocks, each having its own affine transform ▪ The approximation of the full trigonometric expressions with linear ones is driven by a tolerance, default value is 0.333 ▪ Larger values will make reprojection faster, but lower the quality ▪ -Dorg.geotools.referencing.resampleTolerance=0.5
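The trade-off between tolerance and speed can be illustrated with a one-dimensional toy, which is not the GeoTools implementation: a nonlinear function stands in for the projection, an interval is replaced by a straight line when the error at its midpoint is within the tolerance, and otherwise it is split in two. The function `toy_projection` and the midpoint error check are assumptions made for the illustration.

```python
import math
from typing import Callable


def count_linear_blocks(f: Callable[[float], float], a: float, b: float,
                        tolerance: float, max_depth: int = 20) -> int:
    """Count the linear pieces needed to approximate f on [a, b] within tolerance."""
    mid = (a + b) / 2.0
    linear_mid = (f(a) + f(b)) / 2.0  # value of the straight line at the midpoint
    if max_depth == 0 or abs(f(mid) - linear_mid) <= tolerance:
        return 1
    # Error too large: split the interval and approximate each half separately
    return (count_linear_blocks(f, a, mid, tolerance, max_depth - 1)
            + count_linear_blocks(f, mid, b, tolerance, max_depth - 1))


def toy_projection(x: float) -> float:
    """Hypothetical nonlinear mapping with output in pixels."""
    return 1000.0 * math.sin(x)


if __name__ == "__main__":
    for tolerance in (0.333, 0.5):
        blocks = count_linear_blocks(toy_projection, 0.0, math.pi / 2, tolerance)
        print(tolerance, blocks)
```

A larger tolerance never needs more blocks than a smaller one, which is why raising it trades accuracy for fewer exact transform evaluations.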
  88. 88. Preparing vector inputs
  89. 89. Vector data checklist  What do we want from vector data:  Binary data  No complex parsing of data structures  Fast extraction of a geographic subset  Fast filtering on the most commonly used attributes FOSS4G Europe 2014, Bremen 14th-17th July 2014
  90. 90. Choosing a format  Slow formats  WFS  GML  DXF  Good formats, local and indexable  Shapefile  Directory of shapefiles  SDE  Spatial databases: PostGIS, Oracle Spatial, DB2, MySQL*, SQL server* FOSS4G Europe 2014, Bremen 14th-17th July 2014
  91. 91. Shapefiles vs DBMS  Speed comparison vs spatial extent depicted:  Shapefile very fast when rendering the full dataset  Database faster when extracting a small subset of a very large data set  Shapefile  no attribute indexing, avoid if filtering on attribute is important (filtering == reading less data, not applying symbols)  Database  Rich support for complex native filters  Use connection pooling (preferably via JNDI)  Validate connections (with proper pooling) FOSS4G Europe 2014, Bremen 14th-17th July 2014
  92. 92. DBMS Checklist  Rich support for complex native filters  Use connection pooling (preferably via JNDI)  Validate connections (with proper pooling)  Spatial Indexing  Spatial Indexing  Spatial Indexing  Alphanumeric Indexing  Alphanumeric Indexing  Alphanumeric Indexing  Table Clustering  Use views to remove unused attributes FOSS4G Europe 2014, Bremen 14th-17th July 2014
  93. 93. Shapefile preparation  Remove .qix file if present, let GeoServer 2.1.x rebuild it (more efficient)  If there are large DBF attributes that are not in use, get rid of them using ogr2ogr, e.g.: ogr2ogr -select FULLNAME,MTFCC arealm.shp tl_2010_08013_arealm.shp  If on Linux, enable memory mapping, faster, more scalable (but will kill Windows): FOSS4G Europe 2014, Bremen 14th-17th July 2014
  94. 94. Shapefile filtering  Stuck with shapefiles and have scale dependent rules like the following?  Show highways first  Show all streets when zoomed in  Use ogr2ogr to build two shapefiles, one with just the highways, one with everything, and build two layers, e.g.: ogr2ogr -sql "SELECT * FROM tl_2010_08013_roads WHERE MTFCC in ('S1100', 'S1200')" primaryRoads.shp tl_2010_08013_roads.shp  Or hire us to develop non-spatial indexing for shapefile! FOSS4G Europe 2014, Bremen 14th-17th July 2014
  95. 95. PostGIS specific hints  PostgreSQL out of the box configured for very small hardware: http://wiki.postgresql.org/wiki/Performance_Optimization  Make sure to run ANALYZE after data imports (updates optimizer stats)  As usual, avoid large joins in SQL views, consider materialized views  If the dataset is massive, CLUSTER on the spatial index:  http://postgis.refractions.net/documentation/manual-1.3/ch05.html FOSS4G Europe 2014, Bremen 14th-17th July 2014
  96. 96. PostGIS specific hints  Careful with prepared statements (bad performance)  USE CASE: the layer’s style displays the whole layer in a single shot (no scale dependencies)  prepared statements will slow down execution  EXPLANATION: PostGIS will choose to use the spatial index in all cases, which makes retrieving the full data set 2-4 times slower than when using a sequential scan  COUNTERMEASURE: Not using prepared statements allows PostGIS to figure out a suitable plan based on the request bbox instead (assuming someone has run "vacuum analyze" on the database to update the index statistics, and of course, provided there is a spatial index to start with) FOSS4G Europe 2014, Bremen 14th-17th July 2014
  97. 97. Optimize styling
  98. 98. Use scale dependencies  Never show too much data  the map should be readable, not a graphic blob. Rule of thumb: 1000 features max in the display FOSS4G Europe 2014, Bremen 14th-17th July 2014
  99. 99. Labeling  Labeling conflict resolution is expensive, limit it to the innermost zoom levels  Halo is important for readability, but adds significant overhead  Careful with maxDisplacement, it makes the renderer try various label locations FOSS4G Europe 2014, Bremen 14th-17th July 2014
  100. 100. FeatureTypeStyle  GeoServer uses SLD FeatureTypeStyle objects as Z layers for painting  Each one allocates its own rendering surface (which can use a lot of memory), use as few as possible FOSS4G Europe 2014, Bremen 14th-17th July 2014
  101. 101. Use translucency sparingly  Translucent display is expensive, use it sparingly  e.g. translucent fill <CssParameter name="fill-opacity">0.5</CssParameter> FOSS4G Europe 2014, Bremen 14th-17th July 2014
  102. 102. Scale dependent rules  Too often forgotten or little used, yet very important:  Hide layers when too zoomed in (raster/vector example)  Progressively show details  Add more expensive rendering when there are fewer features  Key to any high performance / good looking map FOSS4G Europe 2014, Bremen 14th-17th July 2014
  103. 103. Example FOSS4G Europe 2014, Bremen 14th-17th July 2014
  104. 104. Hide as you zoom in  Add a MinScaleDenominator to the rule  This will make the layer disappear at 1:75000 (towards 1:1) FOSS4G Europe 2014, Bremen 14th-17th July 2014
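The scale check behind this slide can be sketched in a few lines of Python. This is only an illustration of the SLD rule semantics (minimum inclusive, maximum exclusive), not GeoServer code; the function name `rule_applies` is made up for the example, and 75000 is the value from the slide.

```python
def rule_applies(scale_denominator, min_scale=None, max_scale=None):
    """Return True when an SLD rule is active at the given map scale.

    A rule is active when MinScaleDenominator <= scale < MaxScaleDenominator;
    a missing bound means "no limit" on that side.
    """
    if min_scale is not None and scale_denominator < min_scale:
        return False
    if max_scale is not None and scale_denominator >= max_scale:
        return False
    return True


# Rule with MinScaleDenominator = 75000: visible when zoomed out,
# hidden once the map gets closer than 1:75000.
print(rule_applies(100_000, min_scale=75_000))
print(rule_applies(50_000, min_scale=75_000))
```

The same check with `max_scale` set covers the opposite case, where a rule should only kick in once the user has zoomed in far enough.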
  105. 105. Alternative rendering  Simple rendering at low scale (up to 1:2000)  More complex rendering when zoomed in (1:1999 and above) FOSS4G Europe 2014, Bremen 14th-17th July 2014
  106. 106. Alternative rendering FOSS4G Europe 2014, Bremen 14th-17th July 2014
  107. 107. Point symbols • 600 lines of code for 6 different point types • Painful… FOSS4G Europe 2014, Bremen 14th-17th July 2014
  108. 108. Prepare data
      alter table pointlm add column image varchar;
      update pointlm set image = 'shop_supermarket.p.16.png' where MTFCC = 'C3081' and (FULLNAME like '%Shopping%' or FULLNAME like '%Mall%');
      update pointlm set image = 'peak.png' where MTFCC = 'C3022';
      update pointlm set image = 'amenity_prison.p.20.png' where MTFCC = 'K1236';
      update pointlm set image = 'museum.p.16.png' where MTFCC = 'K2165';
      update pointlm set image = 'airport.p.16.png' where MTFCC = 'K2451';
      update pointlm set image = 'school.png' where MTFCC = 'K2543';
      update pointlm set image = 'christian3.p.14.png' where MTFCC = 'K2582';
      update pointlm set image = 'gate2.png' where MTFCC = 'K3066';
    FOSS4G Europe 2014, Bremen 14th-17th July 2014
  109. 109. Dynamic symbolizers FOSS4G Europe 2014, Bremen 14th-17th July 2014
  110. 110. Output tuning
  111. 111. WMS output formats  JPEG: compression artifacts  PNG 8bit: color reduction  PNG 24bit: large size  Sample sizes shown in the figure: 23.8KB, 169.4KB, 66KB, 64KB, 27KB, 27KB FOSS4G Europe 2014, Bremen 14th-17th July 2014
  112. 112. LibJPEG-Turbo WMS Output Format  GeoServer Extension  Leverages LibJPEG-Turbo to accelerate JPEG encoding  40% to 80% increase in throughput  Up to 40% decrease in average response times  Check our blog post here FOSS4G Europe 2014, Bremen 14th-17th July 2014
  113. 113. LibJPEG-Turbo WMS Output Format FOSS4G Europe 2014, Bremen 14th-17th July 2014
  114. 114. Available Color Quantizer  Paletted Images are lighter to move around!  Options: Precompute VS Compute on-the-fly  Precomputed palettes are fast but ugly  ON/OFF Transparency, no antialiasing  On-the-fly palette computation options  Octree  fast, supports ON/OFF Transparency. Default for opaque images  Mediancut  slower, supports full Transparency. Default for translucent images  Check this page and this one as well in the GeoServer doc FOSS4G Europe 2014, Bremen 14th-17th July 2014
  115. 115. WFS output formats  (chart: output size in MB for each format)  HTTP GZip compression is transparent in GeoServer, make sure proxies keep it (or pay 10x price) FOSS4G Europe 2014, Bremen 14th-17th July 2014
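To get a feel for why losing GZip at a proxy is costly for verbose formats such as GML, the following sketch compresses a synthetic, GML-like payload with Python's standard `gzip` module. The payload is made up for illustration (it is not real GeoServer output), and the ratio you get on real data will differ.

```python
import gzip


def gzip_ratio(payload: bytes) -> float:
    """Return uncompressed size divided by gzip-compressed size."""
    return len(payload) / len(gzip.compress(payload))


# Synthetic, repetitive XML standing in for a WFS GetFeature response.
feature = (
    '<gml:featureMember><roads fid="roads.{i}">'
    "<name>Road {i}</name><type>S1100</type>"
    "</roads></gml:featureMember>"
)
payload = "".join(feature.format(i=i) for i in range(5000)).encode("utf-8")

print(f"{len(payload)} bytes uncompressed, ratio {gzip_ratio(payload):.1f}x")
```

Running the same function over a saved response from your own server is a quick way to check what a proxy that strips compression would cost.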
  116. 116. Tiling & Caching
  117. 117. Tile caching with GeoWebCache  Tile oriented maps, fixed zoom levels and fixed grid  Useful for stable layers, backgrounds  Protocols: WMTS, TMS, WMS-C, Google Maps/Earth, VE  Speedup compared to dynamic WMS: 10 to 100 times, assuming tiles are already cached (whole layer pre-seeded)  Suitable for:  Mostly static layer  No (or few) dynamic parameters (CQL filters, SLD params, SQL query params, time/elevation, format options) FOSS4G Europe 2014, Bremen 14th-17th July 2014
  118. 118. Embedded GWC advantage  No double encoding when using meta-tiling, faster seeding FOSS4G Europe 2014, Bremen 14th-17th July 2014
  119. 119. Space considerations  Seeding Colorado, assuming 8 cores, one layer, 0.1 sec 756x756 metatile, 15KB for each tile  Do yours: http://tinyurl.com/3apkpss  Not enough disk space? Set a disk quota
      Zoom level    Tile count     Size (GB)    Time to seed (hours)    Time to seed (days)
      13                58,377             1                       0                      0
      14               232,870             4                       0                      0
      15               929,475            14                       0                      0
      16             3,713,893            57                       1                      0
      17            14,855,572           227                       6                      0
      18            59,396,070           906                      23                      1
      19           237,584,280         3,625                      92                      4
      20           950,273,037        14,500                     367                     15
    FOSS4G Europe 2014, Bremen 14th-17th July 2014
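The seeding estimate can be reproduced with a small Python sketch. It takes the tile counts from the slide as given and assumes that one metatile covers 3x3 = 9 tiles, which is an assumption (the slide only states the metatile pixel size); the function name and parameter names are made up for the example. Sizes are computed with the stated 15KB per tile, so they may come out somewhat different from the figures on the slide.

```python
def seed_estimate(tile_count, tile_kb=15, tiles_per_metatile=9,
                  metatile_seconds=0.1, cores=8):
    """Return (disk size in GB, seeding time in hours) for one zoom level.

    Assumes every core renders metatiles independently and that each
    metatile takes the same time (both are simplifications).
    """
    size_gb = tile_count * tile_kb / 1024 ** 2
    metatiles = tile_count / tiles_per_metatile
    hours = metatiles * metatile_seconds / cores / 3600
    return size_gb, hours


# Tile counts for the deepest zoom levels, copied from the slide.
TILE_COUNTS = {18: 59_396_070, 19: 237_584_280, 20: 950_273_037}

for zoom, count in TILE_COUNTS.items():
    size_gb, hours = seed_estimate(count)
    print(f"zoom {zoom}: {size_gb:,.0f} GB, {hours:,.0f} h, {hours / 24:,.1f} days")
```

Changing `cores`, `metatile_seconds` or `tile_kb` shows quickly which zoom levels are worth pre-seeding and which are better left to on-demand caching with a disk quota.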
  120. 120. More Tweaks  Client-side caching of tiles  Does not work with browsers in private mode
      <expireClientsList>
        <expirationRule minZoom="0" expiration="7200" />
        <expirationRule minZoom="10" expiration="600" />
      </expireClientsList>
    FOSS4G Europe 2014, Bremen 14th-17th July 2014
  121. 121. More Tweaks  Use the right formats  JPEG for background data (e.g. orthophotos)  PNG8 + precomputed palette for background data (e.g. orthophotos)  PNG full for overlays with transparency  PNG8 full for overlays with transparency  Don’t compress things twice!  The format also impacts the disk space needed! (as well as the generation time)  Check this blog post FOSS4G Europe 2014, Bremen 14th-17th July 2014
  122. 122. Resource control
  123. 123. Resource Limits  Improve reliability and stability by limiting the amount of resources dedicated to an individual request  Improve fairness between requests, by preventing individual requests from hijacking the server and/or running for a very long time  EXTREMELY IMPORTANT in production environment  WHEN TO TWEAK THEM?  Frequent OOM Errors despite plenty of RAM  Requests that keep running for a long time (e.g. CPU usage peaks even if no requests are being sent)  DB Connection being killed by the DBMS while in use (ok, you might also need to talk to the DBA..) FOSS4G Europe 2014, Bremen 14th-17th July 2014
  124. 124. Resource Limits  They need to be tweaked together with Control-Flow  Limiting individual requests  Resource Limits  Limiting amount of parallel requests  Control Flow  When time is involved make sure you take into account all pieces of the response chain  E.G. Limited rendering time for WMS on data coming from WMS, items to take into account are  LifeTime of DB Connection (usually long)  WaitTime for a new connection (we don’t want to queue requests at the connection pool when they are already eating memory!) FOSS4G Europe 2014, Bremen 14th-17th July 2014
  125. 125. WMS request limits  Max memory per request: avoids large requests, allows sizing the server memory (max concurrent requests * max memory)  Max time per request: avoids requests taking too much time (e.g., using a custom style provided with dynamic SLD in the request)  Max errors: best effort renderer, but handling errors takes time FOSS4G Europe 2014, Bremen 14th-17th July 2014
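The sizing rule on this slide (max concurrent requests * max memory) is plain arithmetic, sketched here in Python. The figure of 16 concurrent GetMap requests matches the control flow example elsewhere in the deck; the 65536 KB per-request limit is a hypothetical value chosen only for illustration, and the function name is made up.

```python
def worst_case_wms_memory_mb(max_concurrent_requests, max_request_memory_kb):
    """Upper bound on rendering memory: every slot busy with a max-size request."""
    return max_concurrent_requests * max_request_memory_kb / 1024


# 16 parallel GetMap requests, each capped at a hypothetical 65536 KB (64 MB).
print(worst_case_wms_memory_mb(16, 65536), "MB")
```

The result is only the rendering buffers' worst case; the heap still needs headroom for data access, caches and the other services.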
  126. 126. WFS request limits  Max features returned, configured as a global limit  Return feature bbox: reduce amount of generated GML  Per layer max feature count FOSS4G Europe 2014, Bremen 14th-17th July 2014
  127. 127. WCS request limits FOSS4G Europe 2014, Bremen 14th-17th July 2014
  128. 128. Control flow  Control how many requests are executed in parallel, queue others:  Increase throughput  Control memory usage  Enforce fairness  More info here FOSS4G Europe 2014, Bremen 14th-17th July 2014
  129. 129. Control flow  $GEOSERVER_DATA_DIR/controlflow.properties
      # don't allow more than 16 GetMap requests in parallel
      ows.wms.getmap=16
    (chart annotation: 17%) FOSS4G Europe 2014, Bremen 14th-17th July 2014
  130. 130. Auditing  Log each and every request  Log contents driven by customizable template  Summarize and analyze requests with offline tools  More info here FOSS4G Europe 2014, Bremen 14th-17th July 2014
  131. 131. JVM and deploy configuration
  132. 132. Premise  The options discussed here are not going to help visibly if you did not prepare the data and the styles  They are finishing touches that can get performance up once the major data bottlenecks have been dealt with  Check “Running in production” instructions here FOSS4G Europe 2014, Bremen 14th-17th July 2014
  133. 133. JVM settings  -server: enables the server JIT compiler  -Xms2048m -Xmx2048m: sets the JVM to use two gigabytes of memory  -XX:+UseParallelOldGC -XX:+UseParallelGC: enables multi-threaded garbage collection, useful if you have more than two cores  -XX:NewRatio=2: informs the JVM there will be a high number of short lived objects  -XX:+AggressiveOpts: enables experimental optimizations that will be defaults in future versions of the JVM FOSS4G Europe 2014, Bremen 14th-17th July 2014
  134. 134. Native JAI and JDK  Install native JAI and use a recent Sun JDK!  Benchmark over a small data set (the effect is not as visible on larger ones) FOSS4G Europe 2014, Bremen 14th-17th July 2014
  135. 135. Setup a local cluster  Oracle Java2D locks when drawing antialiased vectors  Limits scalability severely  Two options  Use OpenJDK, it’s slower at rendering but scales up well  Use Apache mod_proxy_balancer and set up one GeoServer instance for every 2 to 4 cores (diagram: mod_proxy_balancer in front of three GeoServer instances) FOSS4G Europe 2014, Bremen 14th-17th July 2014
  136. 136. Clustering advantage 66%  FOSS4G 2010 vector benchmarks (roads/buildings/isolines and so on, over the entire Spain)  GeoServer 2.2.x was benchmarked using Oracle JDK without local clustering FOSS4G Europe 2014, Bremen 14th-17th July 2014
  137. 137. Marlin renderer  The OpenJDK Java2D renderer scales up, but it’s not super-fast when the load is small (1 request at a time)  Marlin-renderer to the rescue: https://github.com/bourgesl/marlin-renderer  Complex map, 10 parallel requests, different zoom levels have different details showing up FOSS4G Europe 2014, Bremen 14th-17th July 2014
  138. 138. Upgrade!  Performance tends to go up version by version  Please do use a recent GeoServer version  FOSS4G 2010 vector benchmark with different versions of GeoServer FOSS4G Europe 2014, Bremen 14th-17th July 2014
  139. 139. Benchmarking
  140. 140. Using JMeter  Good benchmarking tool  Allows setting up multiple thread groups, with different parallelism and request counts, to ramp up the load  Can use CSV files to generate semi-randomized requests  Reports results in a simple table http://jakarta.apache.org/jmeter/ FOSS4G Europe 2014, Bremen 14th-17th July 2014
  141. 141. Using JMeter  Thread group: how many threads  Loop: how many requests  HTTP sampler: the request  CSV: read request params from CSV  Summary table FOSS4G Europe 2014, Bremen 14th-17th July 2014
  142. 142. Generating the CSV  Simple randomized generation tool built during WMS shootouts, wms_request.py  Generate csv with the bbox and width/height to be used in JMeter scripts: ./wms_request.py -count 1200 -region -180 -90 180 90 -minres 0.002 -maxres 0.1 -minsize 256 256 -maxsize 1024 1024  Get it here along with a corresponding JMeter script: http://demo1.geo-solutions.it/share/jmeter_2011.zip FOSS4G Europe 2014, Bremen 14th-17th July 2014
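For readers who cannot get hold of the script, the idea behind the generator can be sketched as follows. This is not the actual wms_request.py: the function names, the semicolon delimiter and the column order (width; height; bbox) are assumptions made for this illustration, and the parameters simply mirror the command line shown on the slide.

```python
import csv
import io
import random


def random_wms_requests(count, region, minres, maxres, minsize, maxsize, seed=None):
    """Generate random (width, height, minx, miny, maxx, maxy) tuples.

    Each request picks a random image size and resolution (map units per
    pixel), then places the resulting bbox at random inside the region.
    """
    rng = random.Random(seed)
    min_x, min_y, max_x, max_y = region
    if maxsize[0] * maxres > max_x - min_x or maxsize[1] * maxres > max_y - min_y:
        raise ValueError("largest request would not fit inside the region")
    rows = []
    for _ in range(count):
        width = rng.randint(minsize[0], maxsize[0])
        height = rng.randint(minsize[1], maxsize[1])
        res = rng.uniform(minres, maxres)
        span_x, span_y = width * res, height * res
        x0 = rng.uniform(min_x, max_x - span_x)
        y0 = rng.uniform(min_y, max_y - span_y)
        rows.append((width, height, x0, y0, x0 + span_x, y0 + span_y))
    return rows


def to_csv(rows):
    """Serialize requests as 'width;height;minx,miny,maxx,maxy' lines."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    for width, height, x0, y0, x1, y1 in rows:
        writer.writerow([width, height, f"{x0:.6f},{y0:.6f},{x1:.6f},{y1:.6f}"])
    return out.getvalue()


requests = random_wms_requests(
    1200, (-180, -90, 180, 90), 0.002, 0.1, (256, 256), (1024, 1024), seed=42
)
print(to_csv(requests[:3]))
```

A JMeter CSV Data Set Config element pointed at the resulting file, with matching variable names and delimiter, can then feed the bbox, width and height into the HTTP sampler.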
  143. 143. Checking results  Results table  Run the benchmarks 2-3 times, let the results stabilize  Save the results, check other optimizations, compare the results FOSS4G Europe 2014, Bremen 14th-17th July 2014
  144. 144. Real world deploy
  145. 145. Deploy configuration FOSS4G Europe 2014, Bremen 14th-17th July 2014
  146. 146. Raster data  Whole Italy at 50cm per pixel  Over 4TB, updated fully every 3 years (old data still available for historical access)  Custom pyramid  100 m per pixel: one image  20m per pixel: mosaic of 20 tiles  4m per pixel: mosaic of few hundred tiles  0.5m per pixel: 9000 tiles  Each tile is 10000x10000, with overviews FOSS4G Europe 2014, Bremen 14th-17th July 2014
  147. 147. Vector data  Cadastral data for the whole of Italy, with full history (interval of validity for each parcel)  100 million polygons  A query extracts a subset relative to a certain time interval and area the user is allowed to see  No data from this table is ever shown below 1:50000 (SLD scale dependencies)  Physical table level partitioning (Oracle style) of the table based on geographic area to parallelize and cluster data loading, plus spatial indexing and indexes on commonly filtered attributes FOSS4G Europe 2014, Bremen 14th-17th July 2014
  148. 148. The End Questions? andrea.aime@geo-solutions.it simone.giannecchini@geo-solutions.it FOSS4G Europe 2014, Bremen 14th-17th July 2014
