Justin: -explain the two out of the box behave quite differently - fast cgi vs database connection pooling - tried to make as equal possible ==> GeoSever doing < 1 bit png output (ask andrea about appropriate terminology)
Justin: - Breif overview of WMS for the layman WMS Performance
Brock: WMS Performance
Brock: - random bounding box generation - exact same requests for each server - 10,000 subset of 3 million WMS Performance
Brock: Response time in milliseconds on the Y-axis. Three interesting ways to look at the graph - PostGIS vs. Shapefile - Indexing (large data set vs. small data set) - Mapserver vs. Geoserver - Geoserver had many code changes to improve performance. WMS Performance
Justin: - explain how the concurrency, we chose a “worst case” concurrency model, there are many others WMS Performance
Justin: - suprising results - expected postgis to perform much better under a load - however, read-only access means no locking for shapefiles - Mapserver with fast cgi with 20 processes - GeoServer with connectino pooling at 20 connections WMS Performance
Justin: - same data, different metric, througput - mapserver remains consistent for longer, geoserver starts to degrade around 15 concurrent requests (garbage collector?) Wms Performance
Brock: This test only scratches the surface. At least it gives a sense of how much on-the-fly reprojection affects response time. WMS Performance
Brock: What is FastCGI? Conclusion: use FastCGI unless you have compelling reasons no to. WMS Performance
Brock WMS Performance
Justin WMS Performance
Brock does mapserver Justin does GeoServer - out of the box geoserver much slower ==> defautl 24 bit rendering, logging on, default styling has transparency , anti-aliasing on by default
Brock WMS Performance
Introduction to uDig
Mapserver vs. geoserver
W W W . R E F R A C T I O N S . N E TWMS Performance Tests!Mapserver & Geoserver FOSS4G 2007 Shapefiles vs. PostGIS, Concurrency, and other exciting tests... Presented by Brock Anderson and Justin Deoliveira
W W W . R E F R A C T I O N S . N E TPresentation Outline• Goals of testing.• Quick review of WMS.• Description of the test environment.• Discussion of performance tests and results.• Questions.
W W W . R E F R A C T I O N S . N E TGoals1. Compare performance of WMS GetMap requests in Mapserver and Geoserver.2. Identify configuration settings that will improve performance.3. Identify and fix inefficiencies in Geoserver. We do not test stability, usability, etc.,* We do not test styling or labelling. We focus on vector input.
W W W . R E F R A C T I O N S . N E TKeeping the tests fair• Not an easy job!• We tried to understand what each server doesunder the hood to ensure were not accidentallyperforming unnecessary processing on either server.
W W W . R E F R A C T I O N S . N E TWeb Map Service (WMS) http://server.org/wms? request=getmap& layers=states,lakes& bbox=-85,36,-60,49& format=png&... WMSUser A Map
W W W . R E F R A C T I O N S . N E TTest EnvironmentClient Computer Server Computer Apache 2.2.4 (with mod_fcgi) Vector Data Mapserver 4.10.2 WMS requests JMeter Data 2.2 Tomcat 6.0.14 WMS requests Shapefiles Geoserver 1.6 beta 3 Additional Server Specs: Dual core (1.8Ghz per core). 2GB RAM. 7200RPM disk. Linux. PostgreSQL 8.2.4. PostGIS 1.2.
W W W . R E F R A C T I O N S . N E TTest #1: PostGIS vs. Shapefiles• Two Data Sets: 3,000,000 Tiger roads in Texas 10,000 Tiger roads in Dallas, Texas• Both data sets are in PostGIS and shapefile format.• Spatial indexes on both data sets.• Mapserver and Geoserver layers point at the data.• Minimal styling.• JMeter issues WMS requests to fetch ~1,000 features, limited by the bbox parameter. And the results are...
W W W . R E F R A C T I O N S . N E T Test #1: PostGIS vs. Shapefiles Mapserver Geoserver 400 386 400 350 350Response time (millisec onds) 300 300 250 250 200 200 150 150 100 100 50 39 47 42 42 33 50 50 27 0 0 1,000 of 10,000 1,000 of 3,000,000 1,000 of 10,000 1,000 of 3,000,000 Notes: This test uses two different data sets: one with 3 million features, the other with 10,000. Each bar is an average of 30 sample WMS requests, each using a different bounding box to fetch and draw appx. 1000 features (+/- 15%). The same 30 requests are executed for each scenario. One request at a time (no concurrency). Mapserver and Geoserver use the same data. Mapserver is using FastCGI via Apache/mod_fcgi. Spatial indexes on both data sets. Quadtree indexes generated by shptree. No reprojection required. Minimal styling. Responses are 1-bit PNG images.
W W W . R E F R A C T I O N S . N E TTest #2: Concurrent Requests• Using the same tiger roads data set with 10,000 records.• We issue multiple requests with pseudo-random BBOXes that fetch approximately 1,000 features.• The main difference is that now were issuing multiple concurrent requests. Lets see what happened...
W W W . R E F R A C T I O N S . N E T Test #2: Concurrent Requests Mapserver Geoserver 1500 1400Response time (milliseconds) 1300 1200 1100 1000 900 800 700 500 600 300 400 100 200 -100 0 1 2 5 10 15 20 40 60 1 2 5 10 15 20 40 60 Notes: Data in PostGIS and shapefile formats. Mapserver and Geoserver use the same data. Mapserver is using FastCGI via Apache/mod_fcgi. 20 FastCGI mapserv processes. Geoserver uses connection pooling with 20 connections. Spatial indexes on both data sets. No reprojection required. Minimal styling. Responses are 2-color PNG images. More details in the appendix.
W W W . R E F R A C T I O N S . N E T ... or Throughput, if you prefer Mapserver Throughput Geoserver Throughput 80 80 70 70Responses per second 60 60 50 50 40 40 30 30 20 20 10 10 0 0 1 2 5 10 15 20 40 60 1 2 5 10 15 20 40 60 # Concurrent Requests # Concurrent Requests An alternative way to summarize the data collected for the concurrency test. (Higher lines are better here.)
W W W . R E F R A C T I O N S . N E T Test #3: Reprojection Mapserver (using PROJ to reproject) Geoserver (using Geotools to reproject) 80 80 31 70 22 70 19 21 6 9 12Response time (milliseconds) 60 60 50 0 50 0 4 40 40 30 30 20 20 10 10 0 0 None Geog WGS84 – Geog WGS84 – Geog WGS84 – UTM 14N UTM 14N UTM 14N SPS NAD83 WGS84 - SPS None Geog WGS84 – Geog WGS84 – Geog WGS84 – UTM 14N WGS84 UTM 14N WGS84 UTM 14N NAD27 SPS NAD83 - SPS NAD83 WGS84 NAD27 NAD83PROJ optimizes by assuming these source Geotools is slightly faster thanand target datums are equivalent. PROJ for these cases.Currently Mapserver calls PROJ for every Geoserver simplifies geometry before reprojecting.vertex, but it could improve by batchingthose into a single call.
W W W . R E F R A C T I O N S . N E TCGI vs. FastCGI (Mapserver only) 90 81 80 Response time (milliseconds) 70 57 60 52 50 42 PostGIS 40 Shapefile 30 20 10 0 CGI FastCGI Notes: Average of 30 samples. One request at a time (no concurrency). Each request fetches one layer with 1000 features from a data set of 10,000. Spatial indices on both data sets. No reprojection required. Minimal styling. Responses are 1-bit PNG images. The same binary file was used for both CGI and FastCGI. FastCGI through Apache and mod_fcgi.
W W W . R E F R A C T I O N S . N E T Breakdown of Mapserver Response Time 90 Time (in milliseconds) 80 70 Network delay 60 Write image 50 Draw Fetch & store 40 Query 30 Connect to DB Load map file 20 Start mapserv process 10 0 PostGIS Shapefile• FastCGI eliminates Start mapserv process and Connect to DB costs.• The Write image step is dependant on output format.
W W W . R E F R A C T I O N S . N E T Breakdown of Geoserver Response Time404: Document not found
W W W . R E F R A C T I O N S . N E T Servlet Container and Java (Geoserver only) Jetty 6.0.2 Tomcat 6.0.14Response time (milliseconds) 200 179 200 160 160 120 120 Tomcat 6 95 95 doesnt 80 64 80 support 63 Java 1.4 40 40 0 0 Java 1.4 Java 5 Java 6 Java 1.4 Java 5 Java 6• These results show average response times for the same WMS request when Geoserver is backed by different Servlet containers and Java versions.• Using shapefile backend.• Conclusion: Use Java 6!
W W W . R E F R A C T I O N S . N E T Outcome of the tests• Lots of performance optimizations to Geoserver which will be available in version 1.6.• Identified a few places where Mapserver can improve too. (These will be reported as “bugs” as time permits.)• Both servers can be FAST, but require some special configuration.
W W W . R E F R A C T I O N S . N E T The Road to Speed Mapserver GeoserverResponse time (milliseconds) 1000 1000 800 800 600 600 400 400 200 200 0 0 Start (CGI) Switch to Re-order Output Start Logging Transp. Output JVM Code FastCGI epsg file format Off styles off format settings change Data sources with high All will be in Geoserver 1.6 connection overhead will benefit much more from FastCGI.
W W W . R E F R A C T I O N S . N E T Performance Tips (Mapserver)• Beware of PROJECTION init=epsg:4326 END The “init=” syntax causes one lookup in the PROJ4 epsg file for every occurrence in the map file. (Move your most-used EPSG codes to the top of the epsg file.)• Use FastCGI instead of ordinary CGI. Instruction here: http://mapserver.gis.umn.edu/docs/howto/fastcgi• Ensure you have enough FastCGI processes.
W W W . R E F R A C T I O N S . N E T Performance Tips (Geoserver)• Geoserver has many features enabled by default. Gain performance by disabling features you dont need. – Transparent styles double draw time. Use opacity=1 in your SLD to disable. – Antialiasing linework is costly. Try &format_options=antialias:none to disable. – Experiment with disabling “PNG native acceleration”• Favour Java 6 over Java 5 over Java 1.4.• JVM Settings: Increase heap size. Use -server switch.• Experiment with different shapefile index depths.• Turn off logging
W W W . R E F R A C T I O N S . N E T How can the servers improve?Mapserver Geoserver• More efficient scanning • Various optimizations to of shapefile quadtree the renderer. indexes. [ Bug Reported ] [ Fixes Committed ]• Batch PROJ calls when • More efficient scanning doing on-the-fly of shapefile quadtree reprojection. index. [ Fixes Committed ]• Reduce number of epsg lookups on map files.
W W W . R E F R A C T I O N S . N E TQuestions? Contact Us.Brock Anderson: firstname.lastname@example.orgJustin Deolivera: email@example.com
W W W . R E F R A C T I O N S . N E T General WMS Performance Tips• Only fetch from your data source the features that will be drawn, otherwise the servers have to spend time scanning and discarding the unused ones.• Output format affects response time. 256 color PNG is faster to create than PNG24 on both servers.• On-the-fly reprojection has a price. Store data in the same projection its most commonly requested in.
W W W . R E F R A C T I O N S . N E TAppendixBreakdown of Mapserver Response TimeThe graph represents mapserv running in CGI mode to show all startup costs. Metrics for “Load map file”,“Connect to DB”, “Fetch & store”, “Draw” and “Write image” were collected by modifying source code tocapture and log durarions of those operations. “Query” time measured with PostgreSQLs explain analyzecommand. “Start mapserv process” + “Network delay” = difference between response times recorded byJMeter and my custom mapserv logging which recorded the total time servicing a request. PostGIS Shapefile Start mapserv process 15ms 15ms Load map file 3ms 3ms Connect to DB 14ms n/a Query 20ms n/a Fetch 7ms n/a Draw 11ms 28ms Write image 8ms 8ms Network delay 3ms 3msPostGIS vs ShapefilesThis test uses two different data sets: one with 3,000,000 features, the other with 10,000. Each requestfetches 1000 features by limiting with a bbox WMS parameter. Each bar is an average of 30 samples.One request at a time (no concurrency). Mapserver and Geoserver use the same data. Mapserver is usingFastCGI via Apache/mod_fcgi. Spatial indices on both data sets. The shapefile indices were generatedwith shptree. No reprojection required. Minimal styling. Responses are 2-color PNG images (indexedcolor).The unusual Mapserver result for the case of a 3 million record shapefile has been reported to theMapserver bug tracker: http://trac.osgeo.org/mapserver/ticket/2282
W W W . R E F R A C T I O N S . N E TAppendixConcurrency and ThroughputNotes: Data in PostGIS and shapefile formats. Mapserver and Geoserver use the same data. Mapserver isusing FastCGI via Apache/mod_fcgi. 20 FastCGI mapserv processes. Geoserver uses connection poolingwith 20 connections. Spatial indexes on both data sets. No reprojection required. Minimal styling.Responses are 2-color PNG images (indexed color). “Concurrent” requests were fired in bursts with zeroramp up (as near to simultaneously as possible). I.e. For the test of 10 concurrent requests, all tenrequests were fired at the same time. Once all the responses came back then the next burst of requestswent out. Requests use random bboxes which fetch ~1000 features. The same random bboxes are usedagainst both servers. Mapserver (Response times) Geoserver (Response times) PostGIS Shapefile PostGIS Shapefile 1 50 39 1 42 27 2 51 40 Response times are measured in 2 43 30 5 91 75 5 81 47 milliseconds. Throughput times10 182 147 10 166 103 represent responses per second.15 269 229 15 261 162 The concurrency level is the left-most20 315 283 20 378 252 column in each table (1, 2, 5, 10, ...).40 784 612 40 747 51460 1269 905 60 1170 773 Mapserver (Throughput times) Geoserver (Throughput times) PostGIS Shapefile PostGIS Shapefile 1 19.6 24.9 1 24.6 35.6 2 28.2 33.4 2 32.3 41.8 5 35.4 51.6 5 47.1 68.6 10 38.4 53.8 10 49.9 74.1 15 42.5 55 15 49.2 73.3 20 42.4 54.1 20 47.7 68 40 43.2 54.9 40 48.3 68 60 43.1 51.5 60 47.8 70.7
W W W . R E F R A C T I O N S . N E TAppendixSummary of Geoserver code changes made to improve performance:• optimized access to the shapefile spatial index (it was reading tiny sections of the file instead of doingsome buffered access)• figure out the optmimal palette out of the SLD style (when possible, that is, when antialiasing is off)* dont access the dbf file when not necessary* avoid unecessary operations, like duplicating over and over the same coordinate during rendering(loading it, generalize, reproject, copy back in the geometry and so on, now the array its copied justonce)Raw list of changes here:http://jira.codehaus.org/secure/ManageLinks.jspa?id=55176