Benchmarking 2010/How to get some sample data
How to download sample data
The datasets will take some time to merge and prepare, and the symbology has not been fully decided yet. In the meantime, you can download some sample data by following these steps:
- This is the website where the Spanish National Mapping Agency publishes some datasets
- At the top of the page, hit the Welcome link to enter the English version of the page
- Register as a new user. You'll need to enter the desired username, the password (twice), your full name and surname, and your email address, and solve a captcha.
- Enter your account details to log in.
- Go to Advanced search
- In the "products" drop-down menu, select either BCN25-BTN25 for vector data, or Ortofoto PNOA Máxima Actualidad for aerial imagery.
- In the "MTN50 sheet number" field, enter any number between 1 and 1108.
- Barcelona is 420-421; Madrid is 559. You may want to check a sheet reference (caution: the sheet reference is a 17 MB image)
- Hit "See list of products". Select a file to download. You will see the license for the data, which boils down to "do not make any commercial use of this data" and "say that the data comes from the IGN"
- You will see a survey. Hit "No enviar y continuar con descarga" ("Do not submit and continue with download")
- Hit "Click here to begin download"
Raster data issues
Remember that the baseline for raster data in the OSGeo Benchmark is GeoTIFF, but the site only offers ECW files for download. We will be converting the ECWs to GeoTIFFs.
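If you want to convert a sample sheet yourself in the meantime, the following is a minimal sketch that builds a `gdal_translate` call from Python. It assumes GDAL is installed and was built with ECW read support, which is not the case for every GDAL package. The file name `sheet_0420.ecw` and the helper names are hypothetical, and this is not necessarily the procedure the benchmark team will use.

```python
import shutil
import subprocess
from pathlib import Path


def ecw_to_geotiff_command(ecw_path, out_dir="."):
    """Build the gdal_translate argument list for one ECW sheet."""
    src = Path(ecw_path)
    dst = Path(out_dir) / (src.stem + ".tif")
    # -of GTiff selects GDAL's GeoTIFF output driver.
    return ["gdal_translate", "-of", "GTiff", str(src), str(dst)]


def convert(ecw_path, out_dir="."):
    """Run the conversion. Assumes a GDAL build that can read ECW."""
    if shutil.which("gdal_translate") is None:
        raise RuntimeError("gdal_translate was not found on PATH")
    subprocess.run(ecw_to_geotiff_command(ecw_path, out_dir), check=True)


if __name__ == "__main__":
    # "sheet_0420.ecw" is a hypothetical file name used for illustration.
    print(" ".join(ecw_to_geotiff_command("sheet_0420.ecw")))
```

Uncompressed GeoTIFFs are much larger than the ECW sources, so check the available disk space before converting many sheets.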
There will be one big file for each of the sheets in the following diagram. The area covers the whole of Catalonia, which is about 32,114 km² (12,399.3 sq mi):
Vector data issues
Once unzipped, a sheet of vector data contains lots of shapefiles. The shapefiles will be merged, and most probably only the most interesting themes will be used.
Please note that the shapefiles are in ETRS89, projected in UTM zone 28N, 29N, 30N or 31N. The final shapefiles will be in WGS84, so you're encouraged to reproject them if you think it's necessary for your testing.
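As a rough sketch of the reprojection, the snippet below builds an `ogr2ogr` command for one shapefile. It assumes GDAL/OGR is installed, that you know which UTM zone a given sheet uses, and that "WGS84" means geographic coordinates (EPSG:4326); the file name and the `wgs84` output directory are hypothetical. EPSG:25828 to EPSG:25831 are the codes for ETRS89 / UTM zones 28N to 31N.

```python
from pathlib import Path

# EPSG codes for ETRS89 / UTM zones 28N to 31N.
ETRS89_UTM_EPSG = {28: 25828, 29: 25829, 30: 25830, 31: 25831}
# Assumption: the target "WGS84" is geographic lon/lat, EPSG:4326.
WGS84_EPSG = 4326


def reproject_command(shp_path, utm_zone, out_dir="wgs84"):
    """Build the ogr2ogr argument list to reproject one shapefile."""
    src = Path(shp_path)
    dst = Path(out_dir) / src.name
    return [
        "ogr2ogr",
        "-s_srs", f"EPSG:{ETRS89_UTM_EPSG[utm_zone]}",  # source SRS
        "-t_srs", f"EPSG:{WGS84_EPSG}",                 # target SRS
        str(dst),  # ogr2ogr takes the destination first,
        str(src),  # then the source
    ]


if __name__ == "__main__":
    # "roads.shp" is a hypothetical file name; zone 31N covers Catalonia.
    print(" ".join(reproject_command("roads.shp", 31)))
```

The output directory has to exist before the command is run, for example by passing the list to `subprocess.run(..., check=True)` after creating it.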
A letter after the theme code explains what kind of geometry the shapefile contains. P stands for Point; L stands for Line; S stands for Surface (Polygon).
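The letter convention above can be turned into a small lookup, for instance to sort the unzipped shapefiles by geometry type. This sketch assumes the geometry letter is the last character of the code you pass in; the example code `0201L` is made up for illustration, so check it against the actual file names in a sheet.

```python
# Geometry letters as described in the text above.
GEOMETRY_BY_LETTER = {
    "P": "Point",
    "L": "Line",
    "S": "Surface (Polygon)",
}


def geometry_kind(theme_code):
    """Return the geometry kind for a theme code ending in P, L or S.

    Assumption: the geometry letter is the last character of the code.
    """
    letter = theme_code.strip()[-1:].upper()
    try:
        return GEOMETRY_BY_LETTER[letter]
    except KeyError:
        raise ValueError(f"no geometry letter in {theme_code!r}") from None


if __name__ == "__main__":
    # "0201L" is a hypothetical theme code.
    print(geometry_kind("0201L"))
```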
Only the following layers will be used:
This is by no means necessary, but you might receive a penalty* if your rendering hits any of the following common pitfalls:
And, to make things more interesting, the team that renders the map most beautifully will get one metre of beer for free (probably in the "no sweat" pub)
- Overlapping labels. Your software has text collision detection, hasn't it?
- No anti-aliasing. Who does not use anti-aliasing nowadays?
- Labels on linear features (e.g. river names, motorway names, contour elevation values) should follow the line geometry. Every letter tilted at the same angle = failure.
- Float or double fields holding whole numbers should be cast to int automatically for labels (e.g. the COTA_201 field should render as "310" instead of "310.000000"). This is currently a bug in MapServer and Mapnik.
- Rendering any road casing on top of any road fill. This means that the fill of a road link has precedence over the border of a motorway.
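For the float-to-int label pitfall in the list above, a minimal sketch of the intended behaviour could look like this. The function name is hypothetical, and this is only an illustration of the expected output, not how MapServer or Mapnik format labels internally.

```python
def format_label(value):
    """Format a numeric attribute for use as a map label.

    Whole numbers lose their fractional part ("310.000000" becomes "310");
    other values keep their decimals.
    """
    number = float(value)
    if number.is_integer():
        return str(int(number))
    return str(number)


if __name__ == "__main__":
    # A COTA_201 value as it might come out of the attribute table.
    print(format_label("310.000000"))
```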
In an ideal world where money grows on trees and the streets are made out of candy, the development teams would have unlimited time to work on the map rendering.
Ideally, the rendering should be the same as the official topographic maps. You can download samples from the IGN (just follow the above instructions, but select MTN25RASTER instead of BCN25/BTN25).
As you can see, the rendering rules can become very complex and non-intuitive at first. This is a performance benchmark, not a beauty contest, so teams should focus on performance rather than pixel-to-pixel accuracy.
For reference, the legend of the topo map looks like this: