Geodata Repository

From OSGeo
Revision as of 01:55, 3 November 2008 by Neteler (talk | contribs) (WMS + WFS + TileCache services)

The notes on the Talk page for this page (Talk:Geodata_Repository) describe the background to this effort.

A full list of suggestions for public domain data sets that are nice-to-haves is maintained at Geodata Discovery Working Group.

Getting involved

The geodata repository has a dedicated blade server at TelaScience (cf. SAC Service Status). Right now this is geodata.telascience.org; we need to get an osgeo.org domain pointed to it. The plan is to have something up, running and demoable by FOSS4G.

How you can help

  • Offer feedback on the Geodata Metadata Requirements - if you have a dataset you would like to contribute, does this model express it adequately?
  • Write or contribute an example Simple Catalog Interface
  • If you would like to install software and/or data on the repository machine, please talk to the good people at SAC or visit #osgeo and #telascience on irc.freenode.net to find out how to get an account and get started

Who is involved now

  • Jo Walsh is working on the backend data store and machine level metadata interfaces (also presenting this material at FOSS4G 2006)
  • Chris Schmidt is helping with an OpenLayers-based browsing interface
  • Schuyler Erle is helping with web service / application glue
  • Sean Gillies' OWSLib is, half-beknownst to him, a big part of all these applications - see the OWSLib API user stories at http://trac.gispython.org/projects/PCL/wiki/OwsCapabilitiesUserStory
  • Markus Neteler is contributing data processing code and resources as part of the GRASS data packaging effort
  • Norm Vine is providing a point of contact for administering the system and general reality-checking to all involved.
  • Martin Spott maintains the content of the PostGIS-repository on 'zuluviz'.

Interface Design

To be useful to people, a geodata repository needs a way for users to learn quickly and easily what kind of data they're getting: metadata about the data, and the quality and quantity of data involved.

For raster images, one way to do this is to include an extent-wide image of the data. http://openlayers.org/gallery/ has screenshots that show some of what I mean, although they are more directed towards applications. With a whole-extent overview image, users can quickly and easily see what they're getting. Additionally, assuming this data is to be made available as WMS, setting up an OpenLayers instance that lets users browse the rasters at a more detailed level would be beneficial.

Vector data and attribute data would also need to be described, in text or some other way. http://freemap.in/world/ is an example of a browsable map which displays attribute data: clicking on a country fills the sidebar with data. This interface was set up in about 20 minutes, and given decent map files, this kind of setup could be largely automated, allowing users to (again) get an overview of the data they're looking at before they download it or set it up on their own servers.

Data sources

Sources of public geodata for initial setup in a repository.

PostGIS serving vector data

A PostGIS server, also known as "Landcover-DB", stores different flavours of vector data (thanks to John Graham for hosting this site at TelaScience). Current datasets include: VMap0, VMap1, AptNav, GSHHS, PGS, SWBD, MGRS, TIGER, StatsCan, OSM, GeoNames, CountryCodes and FGSODB - see below for further explanation.

Access - how to get to it !?

The web mapping colour schema has been applied in accordance with the Corine Land Cover project.
  • Direct read access to the database is available to users of the 'geodata.telascience.org' system. Geometries can be retrieved with SQL such as "SELECT asText(wkb_geometry) FROM <Tablename> WHERE <Column> LIKE '<Keyword>'".
  • Collaborative editing via FeatureServer is being prepared.
  • Dump into Shapefiles provided occasionally or upon request; regular schedule possible if required.
  • Other data access upon request, depending on the purpose and use.
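
The retrieval pattern quoted in the list above can be sketched in Python as a parameterised query. This is only a sketch under assumptions: the table and column names in the usage line are hypothetical placeholders, and the `%s` placeholder assumes a DB-API driver that uses the "format" parameter style (psycopg2, for instance). Nothing here is taken from the repository's own scripts.

```python
import re

# Plain SQL identifiers only; anything else is rejected rather than quoted.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_keyword_query(table, column, keyword):
    """Return (sql, params) for the SELECT ... LIKE retrieval pattern."""
    for name in (table, column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain SQL identifier: {name!r}")
    # The keyword is passed as a bound parameter, never pasted into the SQL.
    sql = f"SELECT asText(wkb_geometry) FROM {table} WHERE {column} LIKE %s"
    return sql, (keyword,)


# Hypothetical table and column names, for illustration only.
sql, params = build_keyword_query("v0_lake", "name", "%Geneva%")
print(sql, params)
```

The pair returned by `build_keyword_query` is meant to be handed to a cursor's `execute(sql, params)`; table and column names cannot be bound as parameters, which is why they are validated instead.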

CAVEAT We regret that, due to technical reasons, direct PostGIS database access is currently unavailable.

On Offer !

Short explanation of available datasets (to be extended - the number of datasets as well as their explanation ;-):

Name Description # of layers
VMap0 Selected subsets of Vector Smart Map Level 0 polygons, lines and points, starting with a selection that has proven to be useful for creating FlightGear Scenery from it. Added a 'geonameid' column for joining urban areas with GeoNames (see below). Current details explained at the World Custom Scenery Project, will get synced some day. 33 (DETAIL)
VMap1 First attempt of a selection that would be "nice to have" for FlightGear from Vector Smart Map Level 1 - and certainly for other purposes as well. Added a 'geonameid' column for joining urban areas with GeoNames. Details similar to VMap0. 58 (DETAIL)
AptNav Geometric average of runway center locations plus runway/taxiway shapes as used by the FlightGear and X-Plane flight simulators; data taken from Robin Peel's Airport Database. Locations converted to OGC-style POINT geometries. Use 'icao' column for searching.
  • This import is currently tied to the state of the FlightGear 1.0.0 Base Package release.
1 (DETAIL)
GSHHS Global Self-consistent, Hierarchical, High-resolution Shoreline Database 1.6 shorelines. 4 (DETAIL)
PGS NGA Prototype Global Shoreline. 1 (DETAIL)
SWBD SRTMv2 Water Body Data. 1 (DETAIL)
MGRS Military Grid Reference System, alias UTMREF. 1 (DETAIL)
TIGER Topologically Integrated Geographic Encoding and Referencing system line data. Roads, railroads and water/stream line data from the 2006se release, water body and landmark polygons from 2005fe (thanks to Chris Holmes at The Open Planning Project for providing pre-processed data). 6 (DETAIL)
StatsCan Line data of the Statistics Canada 2006 Road Network File. 1 (DETAIL)
OSM OpenStreetMap Import of the planet dump. Split up into 23 different road-, railroad- and stream-layers; schema taken from Highway Map Features.
  • This import is done manually - typically the weekend after a new planet dump is made available.
6 (DETAIL)
GeoNames Complete content of the "allCountries" export table from the Geonames.org geographical database (as of 2008-08-08). Locations converted to OGC-style POINT geometries. Added a 'pplkey' column for searchable classification of size for populated places [1-7]; schema proposed by Markus Neteler:
  • continental scale (>= 1:50 million): >= 1 million inhabitants
  • multi-national scale (>= 1:10 million): 500000-1 million inhab.
  • country scale (>= 1:1 million): 100000-499999 inhab.
  • regional scale (>= 1:500000): 50000-99999 inhab.
  • city scale (>= 1:50000): 10000-49999 inhab.
  • local scale: < 10000
6 (DETAIL)
CountryCodes Translation table for country codes as proposed here (thanks to Silke Reimer for preparing the table). 1
FGSODB This is the primary location of the FlightGear Scenery Models Repository; models consist of AC3D geometries, RGB/PNG textures and in some cases animations that are defined in XML wrappers. Locations in OGC-style POINT geometries. 1 (DETAIL)
SPECIAL Landsat7 Landuse data at selected areas (example) has been auto-classified from Landsat7-images and converted into suitable polygons at the World Custom Scenery Project. More to come.
  • This import is done manually whenever new data is made available.
18 (DETAIL)
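
The population classes listed for the GeoNames 'pplkey' column in the table above can be sketched as a small lookup. This is an illustration only: the source gives the key range as [1-7] but lists six classes and does not say which number belongs to which class, so the sketch returns the class name rather than a numeric key. Treating exactly 1 million inhabitants as "continental" is likewise an assumption.

```python
# (minimum inhabitants, class name), largest class first.
PPL_CLASSES = [
    (1_000_000, "continental"),
    (500_000, "multi-national"),
    (100_000, "country"),
    (50_000, "regional"),
    (10_000, "city"),
]


def populated_place_scale(population):
    """Return the scale class name for a populated place."""
    if population < 0:
        raise ValueError("population must not be negative")
    for minimum, name in PPL_CLASSES:
        if population >= minimum:
            return name
    # Everything below 10000 inhabitants.
    return "local"


for inhabitants in (3_400_000, 120_000, 800):
    print(inhabitants, populated_place_scale(inhabitants))
```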

Many thanks to Norman Vine for running the HCRA, the "Human Communications Relay Agent" :-)

Procedures ....

How the data gets in

  • Many datasets are readable with the GDAL/OGR(/OGDI) toolbox. Notably these are the VMap datasets, GSHHS, PGS, SWBD and MGRS. The 'ogr2ogr' command is used here, hidden behind a somewhat complex construct of shell scripts which automates the whole thing. Writing to the DB is accomplished by the PostGIS driver provided by 'ogr2ogr'.
  • Some other datasets are automagically or manually (for those sets which are not expected to change often) transformed into OGC-compliant SQL scripts and run through the respective SQL monitor.
  • A few datasets are imported directly into the DB with Perl/DBI. AptNav, for example, is parsed from a text file with a home-grown parser in Perl and written to the database table with DBI.
  • Everything that is meant to represent a geometry is stored in the DB using the OGC-compliant geometry types POLYGON, LINESTRING and POINT. A POINT, for example, is written into the DB as "INSERT INTO <Tablename> ([...]) VALUES ([...] PointFromText('POINT(<Lon> <Lat>)', 4326) [...])" and stored internally in the respective, geospatially searchable geometry data type.
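
Two of the import paths above can be sketched in Python. This is not the repository's own tooling, which consists of shell scripts and Perl/DBI; the database name, source path and table name below are hypothetical, and the `%s` placeholders assume a DB-API driver with the "format" parameter style. The 'icao' and 'wkb_geometry' column names are the ones mentioned elsewhere on this page.

```python
def ogr2ogr_command(source, layer_name, dbname):
    """Argument list for loading an OGR-readable source into PostGIS."""
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",      # output driver
        f"PG:dbname={dbname}",   # destination datasource
        source,                  # source datasource
        "-nln", layer_name,      # name of the table to write
        "-append",               # add to the table if it already exists
    ]


def point_wkt(lon, lat):
    """OGC well-known text for a point, longitude first."""
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates out of range")
    return f"POINT({lon} {lat})"


def point_insert(table, icao, lon, lat):
    """Return (sql, params) for inserting one airport location."""
    sql = (
        f"INSERT INTO {table} (icao, wkb_geometry) "
        "VALUES (%s, PointFromText(%s, 4326))"
    )
    return sql, (icao, point_wkt(lon, lat))


# Hypothetical names and values, for illustration only.
print(ogr2ogr_command("vmap0/eurnasia", "v0_lake", "landcover"))
print(point_insert("apt_airfield", "KSFO", -122.375, 37.619))
```

The command list is intended for `subprocess.run`, which avoids shell quoting issues. The table name in `point_insert` is interpolated directly, so it should only ever come from trusted code, not from user input.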

TODO

Phase 1

  • Import data from many different sources and shape it into a unified format - first results available (see above).
  • Retrieve exact locations of major river dams and waterfalls (keywords: St. Lawrence, Niagara, Bosporus, Gibraltar, ....).
  • Build joins between VMap0/1 urban areas and GeoNames populated places (via names and geographic vicinity).

Phase 2

  • Generation of static per-country and per-region shapefiles; distribution via HTTP and via geotorrent.org.

Phase 3

  • Design and implement a storage and data model for road data that is capable of serving the needs of OpenStreetMap while remaining conformant to OGC standards. Merge the ideas explained in the OSM New Data Model paper as well as Schuyler's OSM on PostGIS initiative - in other words: Try "squaring the circle" ;-)
  • Now that OpenStreetMap has (also) imported TIGER, convince OSM to incorporate an equivalent of VMap0 at places that are currently not covered, finally to create a global road network of maximum detail and accuracy.
  • Merge landcover- and stream-layers from VMap1, TIGER/OSM and Landsat7-classification into the foundation of VMap0 to create a global landcover dataset of maximum detail and accuracy.

Sequence undefined

  • Add proper metadata according to OSGeo metadata recommendation/standard.
  • Find suitable desktop client to FeatureServer.
  • Add polygon data that's been automagically retrieved from Landsat7 images at the World Custom Scenery Project - procedure available.
  • Add polygon data that's been manually digitized from Landsat7 images.

What you can do

  • Help with adding a reasonable colour schema is very much appreciated. Work is currently under way to convert the map display over to the CORINE colour values, but this schema doesn't cover all our needs.

DONE

  • Generation of shapefiles on-the-fly for user-defined region and layer(s). Use 'pgsql2shp' and consider the 2 GByte-per-file limit: Landcover Database Shapefile Download
  • Rebuild TIGER layers from 2006se release according to these instructions - mostly finished.
  • Limit the "viewing angle" of WMS/WFS-services in order to save the DB-servers' life .... Solved by setting MAXSCALE for some layers.
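
The on-the-fly shapefile export mentioned in the list above can be sketched as follows. This is not the code behind the download page: the database and table names are hypothetical, and the bounding-box filter uses `ST_MakeEnvelope`, which assumes a PostGIS version that provides it. The sketch only builds the 'pgsql2shp' command line and a size check against the 2 GByte limit.

```python
import re

SHAPEFILE_LIMIT_BYTES = 2 * 1024 ** 3  # the 2 GByte-per-file limit
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def pgsql2shp_command(database, table, bbox, out_base):
    """Argument list that dumps one layer, clipped to a lon/lat bounding box."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"not a plain SQL identifier: {table!r}")
    xmin, ymin, xmax, ymax = (float(v) for v in bbox)
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("empty or inverted bounding box")
    # '&&' selects features whose bounding box overlaps the envelope.
    query = (
        f"SELECT * FROM {table} WHERE wkb_geometry && "
        f"ST_MakeEnvelope({xmin}, {ymin}, {xmax}, {ymax}, 4326)"
    )
    return ["pgsql2shp", "-f", out_base, database, query]


def fits_in_shapefile(size_bytes):
    """True if a file of this size stays below the per-file limit."""
    return size_bytes < SHAPEFILE_LIMIT_BYTES


# Hypothetical database and table names, for illustration only.
print(pgsql2shp_command("landcover", "v0_lake", (5.0, 45.0, 11.0, 48.0), "out/lakes"))
```

One possible use of `fits_in_shapefile` is to check the size of the resulting .shp and .dbf files after the dump and to split the requested region when either file gets too large.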

Old status

Blue Marble NG

  • Mapserver as WMS

Status

SRTM

Status

Landsat-7

Status

  • Waiting for disk space to finish unpacking the raw data on zuluviz.sdsu.edu
  • Interim plan is to write a simple WMS wrapper script that generates a GDAL VRT to assemble composites on the fly
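
The interim plan above could look roughly like the following sketch, which picks the raw tiles overlapping a requested extent and builds a 'gdalbuildvrt' command for them. The tile index, a plain dictionary of file name to (xmin, ymin, xmax, ymax), and the file names are hypothetical; the actual wrapper script is not described in this page.

```python
def intersects(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes overlap with nonzero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def vrt_command(tile_index, bbox, vrt_path):
    """Argument list that assembles a VRT from the tiles covering bbox."""
    tiles = sorted(
        path for path, extent in tile_index.items() if intersects(extent, bbox)
    )
    if not tiles:
        raise ValueError("no tiles cover the requested extent")
    # -te limits the VRT to the requested extent: xmin ymin xmax ymax.
    return ["gdalbuildvrt", "-te", *(str(v) for v in bbox), vrt_path, *tiles]


# Hypothetical tile index, for illustration only.
index = {"a.tif": (0, 0, 10, 10), "b.tif": (10, 0, 20, 10)}
print(vrt_command(index, (2, 2, 8, 8), "out.vrt"))
```

A WMS wrapper would then render the requested image from the generated VRT, so no composite has to be written to disk in advance.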

Seamless imagery

Status

WMS + WFS + TileCache services

Many layers (see list above) are served as WMS/WFS at http://mapserver.flightgear.org

   WMS: http://mapserver.flightgear.org/ms?Service=WMS&Version=1.1.1&request=GetCapabilities
   
   WFS: http://mapserver.flightgear.org/ms?Service=WFS&Version=1.0.0&request=GetCapabilities
   
   TileCache: http://mapserver.flightgear.org/tc
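
Request URLs for the WMS and WFS endpoints listed above can be built with the Python standard library, as in this sketch. The layer name is a hypothetical placeholder (the real names are in the GetCapabilities response), the GetMap parameters follow WMS 1.1.1, and the sketch assumes the server is still reachable at this address.

```python
from urllib.parse import urlencode

BASE_URL = "http://mapserver.flightgear.org/ms"


def capabilities_url(service, version):
    """GetCapabilities URL for the given service ('WMS' or 'WFS')."""
    query = urlencode(
        {"Service": service, "Version": version, "request": "GetCapabilities"}
    )
    return f"{BASE_URL}?{query}"


def getmap_url(layers, bbox, width, height):
    """WMS 1.1.1 GetMap URL for a lon/lat bounding box, rendered as PNG."""
    query = urlencode({
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the default style
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    })
    return f"{BASE_URL}?{query}"


print(capabilities_url("WMS", "1.1.1"))
# Hypothetical layer name, for illustration only.
print(getmap_url(["v0_landmass"], (5, 45, 11, 48), 600, 300))
```

`urlencode` percent-encodes the commas and the colon in the parameter values; a WMS server decodes them before parsing the request.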

SDSU / OSGeo / FlightGear Landcover Database Shapefile Download

See http://mapserver.flightgear.org/download.psp

Metadata

See Also