Difference between revisions of "Benchmarking 2010"
Revision as of 08:05, 19 August 2010
Basic Premise
Following up on last year's exercise, the performance shoot-out presentation at FOSS4G2010 will test how long each Web mapping server takes to generate a map image, from a common set of spatial data, on a common platform. Each Web mapping server will publish the data through the WMS standard, exposing exactly the same set of LAYERS. A JMeter load test will be run from the testing box to measure how each server performs on those layers.
Participants
Mapping Server | Development Team Leader | IRC nick | Server status |
Cadcorp GeognoSIS | Martin Daly | mpdaly | Production release of server installed with preliminary configuration for SHP, PostGIS, TIF and ECW |
Constellation-SDI | Cédric Briançon | cedricbr | Running on port 8280. Installation done for Shapefiles. |
Erdas Apollo | Dimitri Monie | dmonie | Install and configuration on-going on Windows server, ECW WMS configuration done, shape and DB WMS configuration on-going. Port 80, port 8080 |
GeoServer | Andrea Aime | aaime | Normally running on port 8080 on the Linux server. Shapefiles and TIFF mosaic have been configured. |
inactive | |||
Mapnik | Dane Springmeyer | springmeyer | Running on port 8090 (paleoserver) 8091 (mod_mapnik_wms). Shapefiles styles ready, postgis and raster still to come. |
MapServer | Jeff McKenna | jmckenna | linux_wms_bm serving WMS requests for shps, on port 8081. windows wms_bm serving requests for shps, on port 8081. |
Oracle MapViewer | Michael Smith | msmith_ | Set up on port 8088 |
QGIS mapserver | Marco Hugentobler | marco___ | Set up on port 8081 (Linux server). Shapefiles and TIF (without reprojection) configured |
Timeline
January 1st, 2010 | begin contacting all mapping servers |
June 1st, 2010 | commitment due by all interested mapping servers |
June 2nd, 2010 | exercise begins (and weekly meetings) |
August 1st, 2010 | final testing begins |
August 23rd, 2010 | no further testing |
September 6-9, 2010 | present results at FOSS4G2010 |
Rules of Engagement
- All parties must contribute any changes that they make to their software for this exercise, back to their community. Note that the changes don't have to be contributed before the conference, just in a reasonable period of time.
- Comparisons will be made of the best available version of the software, be it a formal release or a development version.
- Two tests will be run: one 'baseline' test with the data in its raw format (with spatial indexes), and another 'best effort' test where 'the sky is the limit' for what changes you want to make to the data (change format, generalize, etc)
- Teams must document all steps they did to manipulate the data/server for both the 'baseline' and 'best effort' tests. If a team does not document the steps on this wiki then that team's test results will not be used.
- Data formats to be used will be shapefiles for vectors, and uncompressed geotiffs for rasters.
- WMS output formats to be used will be png8 and png24 where possible
Documenting Server Details and Differences
It is the responsibility of each team to document their setup with regard to data.
Please keep your notes in your server's directory in SVN: http://svn.osgeo.org/osgeo/foss4g/benchmarking/
One particular thing that will differ is each team's use of spatial indexes on shapefiles.
Teams are not allowed to modify the .shp, .shx, .dbf or .prj files of the vector data (for the baseline benchmark). They can, however, create auxiliary files for spatial indexes and the like.
Mapping Server | ext | type | command to create |
MapServer | .qix | quadtree | shptree <shpfile> (shptree notes) |
GeoServer | .qix | quadtree | Using the ones generated for MapServer atm |
Mapnik | .index | quadtree | shapeindex <shpfile> (Benchmarking_2010/Mapnik_notes#shpindex_depth) |
Erdas | .rtr | RTree | RTreeBuilder basedir shapename max min (RTreeBuilder notes) |
Cadcorp | *.shp.idx | RTree | GUI command in desktop SIS (Benchmarking_2010/Cadcorp_notes) |
Constellation | .qix | quadtree | Using the ones generated by MapServer |
MapViewer | .oix | RTree | GUI option in MapBuilder.jar, java map/theme/style builder |
- Note: in your server setup documentation be sure to record the EXACT command used to create the indexes, with the appropriate options passed (e.g. depth)
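One way to keep that record is to generate the index commands from a script and paste its output into the notes. The following is a minimal sketch in Python that only builds `shptree` command lines of the form shown in the table; the helper name `shptree_commands`, the file paths and the depth value are assumptions for illustration, not settings any team used.

```python
import shlex
from pathlib import PurePosixPath


def shptree_commands(shapefiles, depth=None):
    """Return one `shptree <shpfile> [<depth>]` command string per .shp file."""
    commands = []
    for shp in shapefiles:
        path = PurePosixPath(shp)
        if path.suffix.lower() != ".shp":
            continue  # skip .shx/.dbf/.prj, which must not be modified
        args = ["shptree", str(path)]
        if depth is not None:
            args.append(str(depth))  # leave depth out to use shptree's default
        # Quote each argument so the logged line can be pasted into a shell.
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


if __name__ == "__main__":
    # Hypothetical file names; the printed lines are what would go in the notes.
    for line in shptree_commands(["/data/road.shp", "/data/road.dbf"], depth=8):
        print(line)
```

Printing the commands rather than running them keeps the sketch side-effect free; the same strings could be passed to a shell once the options are agreed.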
Datasets
A set of data published by the Spanish National Mapping Agency will be used. This data is free for non-commercial use, so it's perfectly OK to use it in the benchmark.
Vector data will be a topographic map, composed of several shapefiles (one shapefile per theme). Raster data will be 50cm/px aerial photography.
We still do not have the definitive datasets, due to their size and the processing time needed to put them together. You can get some sample data by reading Benchmarking 2010/How to get some sample data.
SVN
The project files (minus data) are stored in Subversion (http://svn.osgeo.org/osgeo/foss4g/benchmarking/).
Hardware
Machine A (windows server)
- System Type: Dell PowerEdge R410
- Ship Date: 7/7/2010
- Processor: Intel® Xeon® E5630 2.53Ghz, 12M Cache,Turbo, HT, 1066MHz Max Mem
- 8GB Memory (4x2GB), 1333MHz Dual Ranked RDIMMs for 1Processor, Optimized
- 2TB 7.2K RPM SATA
- OS: Windows Server 64bit
Machine B (linux server)
- System Type: Dell PowerEdge R410
- Ship Date: 7/7/2010
- Processor: Intel® Xeon® E5630 2.53Ghz, 12M Cache,Turbo, HT, 1066MHz Max Mem
- 8GB Memory (4x2GB), 1333MHz Dual Ranked RDIMMs for 1Processor, Optimized
- 2TB 7.2K RPM SATA
- OS: Centos 5.5 x86-64
Machine C (DB Host)
- System Type: Gateway E6610D
- Processor, Intel Core2 Duo - E6750 2.66 Ghz
- 4Gb Ram
- 250Gb Hard Drive 7200 rpm
- OS: Centos 5.5 x86-64
Machine D (testing server (JMeter))
- System Type: Dell Precision Workstation 390
- Ship Date: 9/7/2006
- Processor, 6300, 1.86, 2M, Core Duo-conroe, Burn 2
- 2Gb RAM
- 160 Gb Hard drive 7200 rpm
- Service Tag: 5Q5LQB1
- OS: Centos 5.5 i386
Communication
Coordination/communication is primarily via the Benchmarking mailing list: http://lists.osgeo.org/mailman/listinfo/benchmarking
Weekly meetings will occur through IRC chat in the #foss4g channel on irc.freenode.net
Next IRC meeting
- Wed August 18th @ 15:00:00 UTC
- Provisional Agenda:
- baseline test styling equivalence: are we there yet?
- starting and stopping all servers
- slotting times for each team to run benchmarks
- cleaning up the rendering benchmark blank tiles
- graphic access to the JMeter machine
- last testing date
Previous IRC meeting
- Wed August 11th @ 15:00:00 UTC
- Provisional Agenda:
- JMeter testing
- what output formats will be requested?
- what projections will be requested? (Plate carrée is not conformal)
- what layer combinations will be requested? (All vector layers or various mixed subsets?)
- what envelopes will be issued? (Frank's 2009 query generation script forces a hard cutoff at the largest scale)
- what output metrics will be created?
- how will the jmx file be designed? (need one .csv file per thread so all requests are different)
- Status report from each team
- Testing deadline: shall we extend it?
- Can all servers provide links to the sample requests and allow more to be built? (style comparison, are we all doing the same work?)
- Vector data
- Changes in rendering (missing values for some attributes)
- Gaps: what to do with the 9 missing sheets?
- Summary: log
- Wed August 4th @ 15:00:00 UTC
- Provisional Agenda:
- Machine status / issues
- Data status / issues
- Characters mis-encoded in shapefiles (e.g. catalan in labels): fixable? impact on PostGIS database?
- Raster conversion to GeoTIFF (torrent of results?)
- Styles status / issues
- JMeter testing
- what output formats will be requested?
- what projections will be requested? (Plate carrée is not conformal)
- what layer combinations will be requested? (All vector layers or various mixed subset?)
- what envelopes will be issued? (Frank's 2009 query generation script forces a hard cutoff at the largest scale)
- what output metrics will be created?
- how will the jmx file be designed? (need one .csv file per thread so all requests are different)
- Summary: Benchmarking 2010 IRC meeting logs Aug 4th
- Wed July 28th
(no info)
- Wed July 21st @ 15:00:00 UTC
- Agenda:
- update on data processing for all of Spain
- update on server setup
- update on data styling
- checkin on server installs / library dependencies
- Wed July 14th @ 15:00:00 UTC
- Agenda:
- update on data processing for all of Spain
- update on server setup
- data styling discussions
- scales to use for layers
- missing full road network
- layer to use for labels along lines
- Summary: (log)
- Attendance:
- Jeff McKenna (jmckenna)
- Andrea Aime (aaime)
- Iván Sánchez (IvanSanchez)
- Michael Smith (msmith_)
- Dimitri Monie (dmonie)
- Dane Springmeyer (springmeyer)
- Discussion:
- data:
- full dataset of Spain, vector and raster, has been processed by IvanSanchez, and he is checking it and will create a torrent for transfer, by the end of the week
- because of shapefile size limit some layers may have to be split (contour-1.shp, contour-2.shp)
- servers:
- msmith has received both server machines
- T1 router is being installed, it will be a dedicated line for this project
- he is adding 2 TB drives to each
- layer styling:
- we notice that the Spain dataset we are using (BCN25) does not include street files (zoomed in road network)
- we agreed to use contours for curved label tests instead
- jmckenna will test the layer scales provided by Marco (QGIS mapserver)
- aaime pointed out a problem with the posted SLDs, in terms of the label property (jmckenna to look into it)
- aaime pointed out that we should make sure that all servers compute the scales the same way
- OGC recommends 90 DPI (GeoServer, Mapnik use the recommended 90...MapServer defaults to 72, but should be configurable)
- all teams should look into how their engine handles this and report to group
- other:
- IvanSanchez reported that a Spanish custom-made WMS server may join the exercise (they access raster files only)
- dmonie pointed out that Erdas has been quiet lately because one of the team leaders is on holiday
- Wed June 30th @ 15:00:00 UTC
- Summary: log
- Wed June 23rd @ 15:00:00 UTC
- Agenda:
- update on proposed layer styling
- update on available server specs
- Summary: (log)
- Wed June 16th @ 15:00:00 UTC
- Agenda:
- update on server infrastructure
- update on OSGeo-es dataset (Sample data download instructions)
- Summary: (log)
- Attendance:
- Jeff McKenna (jmckenna)
- Andrea Aime (aaime)
- Iván Sánchez (IvanSanchez)
- Michael Smith (msmith_)
- Cédric Briançon (cedricbr)
- Discussion:
- "base" run will use only shapefiles, "best" run will be any format (raster/vector)
- teams can optionally test Oracle, PostGIS, Ingres connections during the "best" run
- layers tested will be: labelled roads (with labels that follow the lines), thematic polygon map, and a test with labelled roads + thematic polygon + point layer in a single request (3 separate runs)
- msmith and jmckenna to examine styling for one sheet (Barcelona, sheet# 420)
- each base and best run will test data in raw projection (ETRS89 + UTM28/29/30/31) and then reprojected on-the-fly to the Google Mercator projection
- threads (number of concurrent requests) will be: 1-2-4-8-16-32-64
- US Army Corps is proposing to provide 4 desktop machines. 2 identical for testing ( win/lin ) and 1 for jmeter and 1 for databases
- msmith will find out availability of 2-cpu machines this week
- Wed June 2nd @ 15:00:00 UTC
- Agenda:
- team representative introductions
- Establish this year's 'Rules of Engagement' (previous RoE)
- Discuss dataset to be used (including how to share the large dataset, for running local tests)
- Discuss server infrastructure
- Discuss possible changes to the test script parameters
- Set next meeting date
- Summary: (log)
- Attendance:
- Jeff McKenna, MapServer (jmckenna)
- Cédric Briançon - Constellation (cedricbr)
- Michael Smith - Oracle Mapviewer and Oracle connections in MapServer (msmith_)
- Zac Spitzer - MapGuide (zacspitzer)
- Martin Daly - Cadcorp (mpdaly)
- Anne-Sophie Collignon - ERDAS (ascollignon)
- Marco Hugentobler - QGIS mapserver (marco___)
- Andrea Aime - GeoServer (aaime)
- Dimitri Monie - ERDAS (dmonie)
- Frank Warmerdam - MapServer, possibly Ingres as well (FrankW)
- Iván Sánchez - provide geodata from Spanish sources and help out Mapnik (IvanSanchez)
- Dane Springmeyer - Mapnik (springmeyer)
- Johann Sorel - GeotoolKit / Constellation (Eclesia)
- Adrian Custer - Constellation (acuster)
- Pirmin Kalberer - QGIS mapserver (pirmin_k)
- Trevor Wekel - MapGuide (trevorw)
- Daniel Morissette - MapServer (danmo)
- Discussion:
- All agreed to this year's "Rules of Engagement"
- All parties must contribute any changes that they make to their software for this exercise, back to their community. Note that the changes don't have to be contributed before the conference, just in a reasonable period of time.
- Comparisons will be made of the best available version of the software, be it a formal release or a development version.
- Two tests will be run: one 'baseline' test with the data in its raw format (with spatial indexes), and another 'best effort' test where 'the sky is the limit' for what changes you want to make to the data (change format, generalize, etc)
- Teams must document all steps they did to manipulate the data/server for both the 'baseline' and 'best effort' tests. If a team does not document the steps on this wiki then that team's test results will not be used.
- Data formats to be used will be shapefiles for vectors, and uncompressed geotiffs for rasters.
- WMS output formats to be used will be png8 and png24 where possible
- Dataset
- Iván Sánchez Ortega from OSGeo-es is willing to provide local Spain datasets
- data is for non-commercial use only
- 10 GB of shapefiles
- 100 GB of uncompressed rasters
- Server infrastructure
- possible hosts are msmith (US Army Corps) or Skygone (http://www.skygoneinc.com/)
- need two identical servers, one running Unix and another Windows, plus a client machine
- specs could be: quad core, 8GB of memory, 1TB hard drive
- need fast pipe to server, as large datasets will be transferred
- Next meeting date
- all agreed to use same day and time next week, but limit the meeting length to one hour
Sample Dataset Styling
- download sample dataset
- layer styling rules
Benchmarking setup
Tests are going to be run using JMeter, using a progression of 1, 2, 4, 8, 16, 32, and 64 threads, each thread group doing 100, 200, 200, 400, 400, 800 requests respectively, for a total of 2200 requests. The request bounds and sizes are going to be read from a CSV file generated by a random generator script.
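As an illustration of what such a generator could look like, here is a minimal sketch in Python. It is not the script used for the benchmark: the function names, the uniform sampling of scales, the fixed 800x600 image size, the 0.28 mm pixel size and the column order are all assumptions, and the test area is taken to be in metres (a projected CRS).

```python
import csv
import io
import random

PIXEL_SIZE_M = 0.00028  # assumed rendering pixel size in metres (about 90 DPI)


def random_requests(area, min_scale, max_scale, count, size=(800, 600), seed=0):
    """Return `count` rows (width, height, minx, miny, maxx, maxy) inside `area`."""
    minx, miny, maxx, maxy = area
    width, height = size
    # Guard against an endless loop: the smallest footprint must fit the area.
    if (width * PIXEL_SIZE_M * min_scale > maxx - minx
            or height * PIXEL_SIZE_M * min_scale > maxy - miny):
        raise ValueError("area is too small for the smallest requested footprint")
    rng = random.Random(seed)  # seeded, so the same file can be regenerated
    rows = []
    while len(rows) < count:
        scale = rng.uniform(min_scale, max_scale)  # scale denominator
        ground_w = width * PIXEL_SIZE_M * scale
        ground_h = height * PIXEL_SIZE_M * scale
        if ground_w > maxx - minx or ground_h > maxy - miny:
            continue  # footprint larger than the test area, draw again
        x0 = rng.uniform(minx, maxx - ground_w)
        y0 = rng.uniform(miny, maxy - ground_h)
        rows.append((width, height, x0, y0, x0 + ground_w, y0 + ground_h))
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text, one request per line."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    raster_area = (372360, 4557880, 484490, 4631460)  # raster test area from this page
    print(to_csv(random_requests(raster_area, 10000, 300000, 5)), end="")
```

Using a different `seed` per output file is one way to give every JMeter thread its own set of requests.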
The scripts to be run for the baseline tests, and repeated for the best effort one, are:
- raster data, JPEG output, EPSG:25831, scales between 1:1M and 1:10k
- raster data, JPEG output, EPSG:3857, scales between 1:1M and 1:10k
- vector data, PNG24 output, EPSG:4326, scales between 1:300k and 1:1k
- vector data, PNG24 output, EPSG:3857, scales between 1:300k and 1:1k
- raster + vector, JPEG output, EPSG:25831, scales between 1:300k and 1:1k
- raster + vector, JPEG output, EPSG:3785, scales between 1:300k and 1:1k
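These scale ranges only mean the same thing on every server if all servers assume the same pixel size; the July 14th meeting notes record 90 DPI for GeoServer and Mapnik against a MapServer default of 72. The sketch below, in Python, shows how the assumed resolution changes the scale denominator computed for one request. It assumes a bbox in metres (for example EPSG:25831) and uses the 0.28 mm standardized rendering pixel size from the OGC specifications as its default; the function name and the sample bbox are illustrative.

```python
INCH_M = 0.0254             # metres per inch
OGC_PIXEL_SIZE_M = 0.00028  # OGC standardized rendering pixel size (0.28 mm)


def scale_denominator(bbox, width_px, dpi=None):
    """Scale denominator of a request, for a bbox given in metres.

    With dpi=None the 0.28 mm pixel is used; otherwise the pixel size is
    derived from the given DPI.
    """
    minx, _miny, maxx, _maxy = bbox
    pixel_size_m = OGC_PIXEL_SIZE_M if dpi is None else INCH_M / dpi
    ground_metres_per_pixel = (maxx - minx) / width_px
    return ground_metres_per_pixel / pixel_size_m


if __name__ == "__main__":
    bbox = (400000, 4600000, 408000, 4606000)  # hypothetical 8 km x 6 km request
    for dpi in (None, 90, 72):
        print(dpi, round(scale_denominator(bbox, 800, dpi)))
```

The same request lands at a noticeably different scale at 72 DPI than at 90 DPI, which is enough to switch scale-dependent layers on or off on one server but not another.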
The areas to be tested are:
- 6 E, 38 N, -1 E, 43 N for vector tests
- 372360,4557880,484490,4631460 for raster tests
The raster layer in the baseline test is going to be the mosaic of GeoTIFF files Frank prepared out of the ECW files on the Linux server.
The vector data set consists of individual layers, stacked in the following order (bottom to top):
- settlements
- contour-0
- contour-1
- contour-2
- contour-3
- contour-4
- contour-5
- contour-6
- contour-7
- building
- industry
- track
- ramp
- road
- motorway
- point-labels-for-geometry
- point-labels-no-geometry
The best effort test will use its own custom set of layers (e.g., it's likely contours will be merged into one, and so on)
Live Benchmark WMS GetMap Requests
Note: the sample bboxes were fixed on Aug 12 to fit the aspect ratio of an 800x600 image request (springmeyer). The previous bbox for the Barcelona request was '1.8,41,2.4,42'; the new one used below is '1.43333333333,41.0,2.76666666667,42.0'. The previous bbox for the focused view was '2.1076723642349,41.407828508849,2.1178733654021,41.414271246429'; the new one is '2.10767236423,41.4072245022,2.1178733654,41.4148752531'. This was done to sidestep a deficiency in Mapnik, which does not apply the affine transformation of geometries that the WMS spec calls for when the image width/height aspect does not match the bbox aspect. Jeff and Andrea agreed in IRC that the change is useful for these demo links.
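The adjustment described in the note can be sketched as follows. This is a minimal Python illustration, not the code that was actually used: it assumes the bbox is grown around its centre along whichever axis is too short, which is consistent with the old and new values quoted above, and the function name is made up for the example.

```python
def fit_bbox_to_aspect(bbox, width_px, height_px):
    """Grow a bbox around its centre until it matches the image aspect ratio.

    The bbox is only ever expanded, so the original extent stays visible.
    """
    minx, miny, maxx, maxy = bbox
    bbox_w = maxx - minx
    bbox_h = maxy - miny
    target = width_px / height_px
    if bbox_w / bbox_h < target:
        # Too narrow for the image: widen around the horizontal centre.
        new_w = bbox_h * target
        cx = (minx + maxx) / 2
        return (cx - new_w / 2, miny, cx + new_w / 2, maxy)
    # Too flat (or already matching): grow the height around the vertical centre.
    new_h = bbox_w / target
    cy = (miny + maxy) / 2
    return (minx, cy - new_h / 2, maxx, cy + new_h / 2)


if __name__ == "__main__":
    barcelona = (1.8, 41, 2.4, 42)
    print(",".join(str(v) for v in fit_bbox_to_aspect(barcelona, 800, 600)))
```

Expanding rather than cropping means a request never loses part of the area the original bbox was meant to show.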
MapServer
- Windows 64bit Server
- Shapefiles, MapServer 5.7-dev (Barcelona extents)
- Shapefiles, MapServer 5.7-dev (with labelled contours)
- Linux
- Shapefiles, MapServer 5.7-dev (Barcelona extents)
- Shapefiles, MapServer 5.7-dev (with labelled contours)
- Shapefiles, MapServer 5.7-dev-label-fix (with labelled contours)
- Shapefiles, MapServer 5.7-dev-label-fix (labelled contours, larger view)
GeoServer
- Shapefiles GeoServer 2.1.x (Barcelona extents)
- Shapefiles GeoServer 2.1.x (with labelled contours)
- Shapefiles GeoServer 2.1.x (labelled contours, larger view)
- TIFF mosaic, zoomed in, GeoServer 2.1.x
- TIFF mosaic, mid zoom, GeoServer 2.1.x
- TIFF mosaic, whole area, GeoServer 2.1.x
Mapnik
- Shapefiles Mapnik2 / paleoserver (Barcelona extents)
- Shapefiles Mapnik0.7.2 / mod_mapnik_wms (Barcelona extents)
- Shapefiles Mapnik2 / paleoserver (larger view - labeled contours)
Constellation
- Shapefiles Constellation (Barcelona extents)
- Shapefiles Constellation (labelled contours, larger view)
Oracle MapViewer
- Oracle DB / MapViewer (Barcelona Extents)
- Oracle DB / MapViewer (with labeled contours)
Cadcorp GeognoSIS
- Shapefiles, GeognoSIS 7.0 (Barcelona Extents)
- Shapefiles, GeognoSIS 7.0 (with labelled contours - see note)
- PostGIS, GeognoSIS 7.0 (Barcelona Extents)
- PostGIS, GeognoSIS 7.0 (with labelled contours - see note)
- Oracle, GeognoSIS 7.0 (Barcelona Extents)
- Oracle, GeognoSIS 7.0 (no contours - under investigation)
- TIFF mosaic, zoomed in, GeognoSIS 7.0
- TIFF mosaic, mid zoom, GeognoSIS 7.0
- TIFF mosaic, whole area, GeognoSIS 7.0
- ECW, zoomed in, GeognoSIS 7.0
- ECW, mid zoom, GeognoSIS 7.0
- ECW, whole area, GeognoSIS 7.0
N.B. The current GeognoSIS Label Theme "label along lines" option does not work well with this data. We hope to have an alternative option ready in time for the tests
QGIS mapserver
- QGIS mapserver shapefile (Barcelona Extents)
- QGIS mapserver shapefile (with labeled contours)
- QGIS mapserver TIFF mosaic, zoomed in
- QGIS mapserver TIFF mosaic, mid zoom
- QGIS mapserver TIFF mosaic, whole area
Starting and stopping the various servers
- GeoServer on the Linux box:
- starting: /benchmarking/geoserver/start_geoserver.sh
- stopping gracefully: /benchmarking/geoserver/stop_geoserver.sh
- check it's really dead: ps aux | grep GEOSERVER . If it hasn't stopped gracefully use kill -9 <pid> to terminate the process
- MapServer and QGIS Mapserver on the Linux box (those two should probably be separated):
- starting: /opt/mapserver/bin/apachectl start
- stopping: /opt/mapserver/bin/apachectl stop
- Mapnik:
- paleoserver (standalone daemon):
- stopping: /opt/mapnik/paleoserver_stop.sh
- starting: no easy way for other users to start it
- mod_mapnik_wms (runs within apache):
- starting: /opt/mapnik/mod_mapnik_start.sh
- stopping: /opt/mapnik/mod_mapnik_stop.sh
- Mapviewer:
- starting: /opt/mapviewer/start_mv.sh
- stopping: /opt/mapviewer/stop_mv.sh
- check it's really dead: ps aux | grep java | grep mapviewer . If it hasn't stopped gracefully use kill -9 <pid> to terminate the process
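The "check it's really dead" steps above can be scripted. The sketch below is a minimal Python illustration under stated assumptions: it parses the text printed by `ps aux`, keeps processes whose command line contains every given pattern (mirroring the chained `grep` calls), and ignores the `grep` process itself. The function name is hypothetical, and the sketch only reports PIDs; whether to follow up with `kill -9 <pid>` is left to the operator.

```python
import subprocess


def matching_pids(ps_output, patterns):
    """Return PIDs from `ps aux` text whose command contains all `patterns`."""
    pids = []
    for line in ps_output.splitlines():
        # ps aux prints 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or malformed row
        command = fields[10]
        if command.startswith("grep"):
            continue  # a grep started by the check itself is not the server
        if all(pattern in command for pattern in patterns):
            pids.append(int(fields[1]))
    return pids


if __name__ == "__main__":
    output = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    ).stdout
    survivors = matching_pids(output, ["GEOSERVER"])
    if survivors:
        print("still running:", survivors)
    else:
        print("no matching process found")
```

For MapViewer the equivalent call would pass both patterns, `["java", "mapviewer"]`, matching the two chained `grep` filters.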