Difference between revisions of "Geodata Metadata Requirements"

From OSGeo

Revision as of 14:44, 10 April 2006

Why this document exists

One goal of the Public Geospatial Data Project is to offer, in the future, a repository of reusable public geographic data that can support open source geospatial software projects, both inside and outside the foundation.

One big requirement for a potential Geodata Repository is that there be a well-defined baseline for metadata. This can be seen as a quality assurance effort - data won't be accepted without a certain amount of metadata.
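The acceptance rule above could be automated along these lines. This is only a minimal sketch, not an agreed procedure: the field names in `REQUIRED_FIELDS` are assumptions borrowed from the draft model further down this page, and `missing_fields` and `is_acceptable` are hypothetical helpers.

```python
# Hypothetical baseline: these field names are assumptions taken from the
# draft metadata model on this page, not an agreed list.
REQUIRED_FIELDS = ("title", "description", "originator", "license", "publication_date")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for name in required:
        value = record.get(name)
        # Treat None and whitespace-only strings as "not provided".
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(name)
    return missing


def is_acceptable(record):
    """A data set passes the baseline only if nothing required is missing."""
    return not missing_fields(record)
```

Reporting the list of missing fields, rather than a bare yes/no, would let a contributor see exactly what still needs to be supplied.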

The US Federal Geographic Data Committee (FGDC) metadata standard (http://www.fgdc.gov/metadata/) emphasises conformance, but not exchangeability or reusability. It also lacks some properties that would be really useful to have, such as distribution channels like WFS and BitTorrent, which have come into existence since FGDC was originally defined. For many elements, FGDC asks for full-text descriptions; more structure in descriptions would help with automating discovery or re-use.

This is a "straw-person" set of suggestions, and comment / additional references would be gratefully received.

Draft Metadata model

Graph illustrating a basic metadata model generated from an RDF model of what OSGeo Geodata Committee participants have identified as their core needs for metadata.

Data Set

title

Title of the data set. Corresponds to the Dublin Core title element.

description

Text description of the data set. Corresponds to Dublin Core description element.

originator

Person

A person responsible for publication of the data set - name and contact email address. These properties are well-defined in the FOAF vocabulary.

Organization

An organization responsible for publication of the data set - name and contact email address. These properties are well-defined in the FOAF vocabulary.
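As an illustration of how title, description and originator might be written down with the Dublin Core and FOAF vocabularies, here is a minimal sketch that emits Turtle using only the Python standard library. Linking the originator with `dc:creator` is an assumption (the page does not say which property to use), and `dataset_to_turtle` and the example URI are hypothetical.

```python
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"


def _literal(text):
    """Quote a string as a Turtle literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def dataset_to_turtle(dataset_uri, title, description, name, email, is_organization=False):
    """Render a data set and its originator as a small Turtle document."""
    agent_class = "foaf:Organization" if is_organization else "foaf:Person"
    lines = [
        f"@prefix dc: <{DC}> .",
        f"@prefix foaf: <{FOAF}> .",
        "",
        f"<{dataset_uri}>",
        f"    dc:title {_literal(title)} ;",
        f"    dc:description {_literal(description)} ;",
        # Assumption: dc:creator links the data set to its originator.
        "    dc:creator [",
        f"        a {agent_class} ;",
        f"        foaf:name {_literal(name)} ;",
        f"        foaf:mbox <mailto:{email}>",
        "    ] .",
    ]
    return "\n".join(lines)


print(dataset_to_turtle("http://example.org/data/roads", "Roads",
                        "Road centre lines", "Jo Bloggs", "jo@example.org"))
```

An RDF library would be a more robust way to produce this; plain string building is used here only to keep the sketch self-contained.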

Spatial Reference

Vector, Raster or Point data, as described in FGDC.

Data source

URL from which the data can be downloaded via different protocols

WFS

For Vector data in GML

WMS

For Raster data, served as rendered map images
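For both protocols, the data source URL would typically be a service endpoint that answers a GetCapabilities request. A minimal sketch of building such a request URL is below; `capabilities_url` is a hypothetical helper and the endpoint in the usage line is made up. Real servers may also need a `VERSION` parameter, which is left out here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def capabilities_url(base_url, service):
    """Build a GetCapabilities URL for an OGC service such as "WMS" or "WFS"."""
    parts = urlsplit(base_url)
    # Keep existing query parameters (e.g. a map file), dropping any we set ourselves.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ("service", "request")
    ]
    params += [("SERVICE", service.upper()), ("REQUEST", "GetCapabilities")]
    return urlunsplit(parts._replace(query=urlencode(params)))


print(capabilities_url("http://example.org/ows", "wfs"))
```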

License information

Emphasis on public geographic data licenses: PGL, possible LPGL, Public Domain, Creative Commons-type licenses

Completeness

Hard to describe, so not described yet. To be useful, this has to provide more than a free-text description (although that can also be an option).

Publication date

Corresponds to Dublin Core date: an ISO 8601-compliant date of publication.

Time Period

start date and end date

single date
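One way to treat both cases uniformly is to store a single date as a period whose start and end coincide. The sketch below assumes ISO 8601 calendar dates (`YYYY-MM-DD`); the `TimePeriod` class is hypothetical and not part of any agreed model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TimePeriod:
    """Time span covered by a data set; a single date has start == end."""
    start: date
    end: date

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("end date precedes start date")

    @classmethod
    def parse(cls, start, end=None):
        """Build a period from ISO 8601 calendar dates (YYYY-MM-DD)."""
        first = date.fromisoformat(start)
        last = date.fromisoformat(end) if end is not None else first
        return cls(first, last)

    @property
    def is_single_date(self):
        return self.start == self.end


survey = TimePeriod.parse("2006-01-01", "2006-04-10")
snapshot = TimePeriod.parse("2006-04-10")
```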

Spatial Domain

A lot of this can be inferred either using GDAL/OGR or collected from a WMS/WFS GetCapabilities response. It would be nice to avoid human error in collecting this kind of metadata.
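As a rough illustration of the GetCapabilities route, the sketch below reads per-layer extents from a WMS 1.1.1 capabilities document using only the standard library. It assumes the `LatLonBoundingBox` element of WMS 1.1.1 (later versions describe the extent differently), it ignores extents inherited from parent layers, and `layer_extents` is a hypothetical helper. The sample document is invented.

```python
import xml.etree.ElementTree as ET


def layer_extents(capabilities_xml):
    """Map layer name -> (west, south, east, north) from LatLonBoundingBox elements."""
    root = ET.fromstring(capabilities_xml)
    extents = {}
    for layer in root.iter("Layer"):
        name = layer.findtext("Name")
        box = layer.find("LatLonBoundingBox")
        if name is None or box is None:
            continue  # unnamed container layers, or layers that only inherit an extent
        extents[name] = tuple(
            float(box.get(key)) for key in ("minx", "miny", "maxx", "maxy")
        )
    return extents


SAMPLE = """<WMT_MS_Capabilities version="1.1.1"><Capability><Layer><Title>Root</Title>
<Layer><Name>roads</Name>
<LatLonBoundingBox minx="-10.5" miny="49" maxx="2" maxy="61"/></Layer>
</Layer></Capability></WMT_MS_Capabilities>"""

print(layer_extents(SAMPLE))
```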

Extents

FGDC specifies north, east, west and south bounding co-ordinates, but it doesn't specify a projection in which these should be described. For simplicity it could make sense to require that these be in WGS84 (EPSG:4326), for the same reason GeoRSS decided to mandate WGS84 rather than complicate matters by dictating that people also specify an SRS.
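If extents were required to be WGS84 latitude/longitude, a repository could run a cheap sanity check that catches, for example, projected co-ordinates in metres submitted by mistake. This is a minimal sketch; the `Extent` class is hypothetical, and allowing west to exceed east for extents that cross the antimeridian is an assumption rather than something FGDC or this page prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """Bounding co-ordinates in WGS84 decimal degrees."""
    north: float
    east: float
    west: float
    south: float

    def __post_init__(self):
        for lat in (self.north, self.south):
            if not -90.0 <= lat <= 90.0:
                raise ValueError(f"latitude out of range: {lat}")
        for lon in (self.east, self.west):
            if not -180.0 <= lon <= 180.0:
                raise ValueError(f"longitude out of range: {lon}")
        if self.south > self.north:
            raise ValueError("south bound lies north of north bound")
        # west > east is allowed: it denotes an extent crossing the antimeridian.

    @property
    def crosses_antimeridian(self):
        return self.west > self.east


british_isles = Extent(north=61.0, east=2.0, west=-10.5, south=49.0)
```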

Projection (Raster)

Original projection of the data (reference to an

Scale (Vector)

Taxonomy/Ontology

Currently undecided; would be good to refer this to current well-known thesauri for data themes.

References

Notes

Metadata isn't an easy task. The balance between completeness and people simply neglecting to generate it...

I wish I had had a preexisting plan of how to index and search for the data sets on extent and 'type' that we were adding.


See Also