From OSGeo
Revision as of 05:39, 29 May 2007 by Wiki-JoWalsh (talk | contribs) (moving contents of the original into the talk page)

OAI-PMH - stefan, why don't we want to require GetRecord ??

Just thought, ListRecords (incl. temporal query parameters) would be enough for harvesting. But I don't mind. -- Stefan 09:30, 15 March 2007 (CET)


My current thinking, after talking to pramsey about Simple Catalog Interfaces, is to separate out interfaces for different classes of data search / retrieval tasks. According to this narrative, data, services (including presentation) and "relationships" are three different classes of thing which CSW/ebRIM is trying to treat all at once, and that is why it is so slow and overcomplex.

Service metadata

Many people are more interested in repositories of information about web services. They want to do realtime access "find-bind".

Package metadata

Right now this is the domain of things like shape files, GML files etc. Theoretically it could be any resource that you visit and that doesn't change between visits. This also applies to local filesystems and data management.

Publishing metadata

Other people are more interested in the publish side - real data syndication between repositories - the kind of thing that OAI-PRE will one day support. In the meantime we have to make something out of what we have got. Jeroen writes,

For the harvesting of the catalog itself, GeoNetwork has a custom
interface/ process. This process will go out to some XML service
(provided by another GeoNetwork node at this stage) and will request
a very brief result set from the other catalog that contains file
identifier (UUID), time stamp and catalog ID.  By comparing those
with its internally cached records it will decide to request new or
updated records and it will remove those not found anymore.
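The comparison step Jeroen describes can be sketched roughly as follows. This is only an illustration of the idea, not GeoNetwork's actual code: the names `RecordSummary` and `plan_harvest`, and the shape of the local cache (a dict from UUID to time stamp), are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RecordSummary:
    """One entry of the brief result set: UUID, time stamp, catalog ID."""
    uuid: str
    timestamp: datetime
    catalog_id: str


def plan_harvest(remote, cached):
    """Compare remote summaries with the locally cached records.

    remote: iterable of RecordSummary returned by the other catalog.
    cached: dict mapping UUID -> time stamp of the local copy
            (assumed to hold only records harvested from that catalog).
    Returns (to_fetch, to_remove) as sorted lists of UUIDs.
    """
    remote_by_uuid = {r.uuid: r for r in remote}
    # New records, or records whose remote time stamp is newer than ours.
    to_fetch = sorted(
        uuid for uuid, r in remote_by_uuid.items()
        if uuid not in cached or r.timestamp > cached[uuid]
    )
    # Cached records that the remote catalog no longer lists.
    to_remove = sorted(uuid for uuid in cached if uuid not in remote_by_uuid)
    return to_fetch, to_remove
```

The harvester would then request the full records in `to_fetch` and drop those in `to_remove` from its cache.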


Implementations of OAI-PMH

Each entry gives the name of the implementation, whether it is software or a web application (or an instance of both), the programming language and an example web link. Some are using oai_dc 'DClite4G style'.


Metadata Providers:

Candidate Projects (harvesters and metadata providers):

  • uDig (Provider+Harvester; Desktop Software/Open source; Lang. Java; Ex. tbd.)
  • OSGEO (Provider; Instance/Open Source; Lang. C+Scripts; Ex. tbd.)
  • Canton of Aargau? (Provider; Instance/Closed source; Lang. .NET; Ex. tbd.)

DCLite4G is an effort to

  • establish a common information model with mappings to well known formats for geospatial metadata
  • provide a vocabulary/recommendations for spatial extensions to Dublin Core
  • support simple interfaces for collecting and querying geospatial metadata

Getting involved

To get involved, first of all subscribe to the Mailing List (see below). Then read all corresponding pages of this Wiki. If you find that something is missing or incorrect, correct or add it as required. To do this you need to create an account for this Wiki. Sorry, still working on Single Sign On...

Once you get the hang of it and feel like getting more involved, you can join any interest group or project and contribute to the process. This can happen from many perspectives: as a user, portal operator, developer, decision maker, you name it.

Mailing List

Feel welcome to join the discussion and development mailing list:


  • extract more of the specifics from Geodata Metadata Requirements. (This has changed a lot between the original version in the page history and the current version, which differentiates data sets from data sources; a data source may be a file, a database or a service.)
  • update docs on Geodata Metadata Model
  • draw some UML to appease people?

Information Model

Core Model


Dublin Core

The Dublin Core Metadata Initiative is an open organization engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. An introduction to the topic can be read in the Usage Guide.




The dclite4g namespace provides a common naming scheme for properties specific to geographic data sets and services which are not covered by the existing Dublin Core or GeoRSS standards.

The namespace will come to live at .

  • extents
  • accuracy
  • scale

fill in here from Geodata Metadata Requirements


Query Interface


The Harvesting Protocol (version 2.0) specification together with Implementation Guidelines

The following are specific guidelines for a minimal OAI-PMH implementation of a so-called 'data provider' using only the mandatory 'unqualified' Dublin Core (DC):

  • Only three operations (verbs) are needed: Identify, ListMetadataFormats and ListRecords.
  • The following operations are not required (initially): ListIdentifiers, ListSets, GetRecord.
  • No partial-list harvesting (the resumptionToken process for ListXxx operations with more than 1000 records)
  • No compression as defined in the OAI-PMH spec (compression at the lower HTTP level is still possible)
  • Date granularity may be 'day' rather than seconds (YYYY-MM-DD)
  • Keeping track of deleted records need not be supported (deletedRecord=no)
  • Supporting only the mandatory DC as data model is sufficient for a start, but with specific semantics (e.g. for coverage, relation) (see also example below):
    • dc:description contains dct:abstract
    • dc:coverage contains bounding box encoding as defined in
    • dc:date means in fact dct:modified
    • dc:relation is filled in with dclite4g:onLineSrc. If dc:type='service' dct:hasPart can be derived from GetCapabilities.
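As a rough illustration of the mapping above, the following sketch builds an oai_dc record with Python's standard library. The function name `build_oai_dc` and its parameters are made up for the example, dc:title is added only because a record normally has one, and the bounding box is passed through as an opaque string because its encoding is not spelled out here.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the oai_dc metadata format of OAI-PMH 2.0.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def build_oai_dc(title, dc_type, abstract, bbox_text, modified, online_src):
    """Build an <oai_dc:dc> element following the semantics listed above.

    bbox_text is whatever bounding box encoding is agreed on; this
    sketch does not interpret or validate it (assumption).
    """
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    fields = [
        ("title", title),
        ("type", dc_type),
        ("description", abstract),   # carries dct:abstract
        ("coverage", bbox_text),     # carries the bounding box encoding
        ("date", modified),          # means dct:modified
        ("relation", online_src),    # carries dclite4g:onLineSrc
    ]
    for name, value in fields:
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return root
```

Calling `ET.tostring(build_oai_dc(...), encoding="unicode")` yields the XML that would sit inside the `<metadata>` element of a ListRecords response.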

Additional bounding box query property for ListRecords.
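A harvester's ListRecords request under these guidelines might be put together as in the sketch below. The `verb`, `metadataPrefix`, `from` and `until` arguments are those of OAI-PMH 2.0; the `bbox` argument name and its comma-separated west,south,east,north encoding are purely hypothetical stand-ins for the additional bounding box query property, which is not defined yet. The base URL in the usage line is also just an example.

```python
from datetime import date
from urllib.parse import urlencode


def list_records_url(base_url, from_day=None, until_day=None, bbox=None):
    """Build a ListRecords request URL for unqualified Dublin Core.

    from_day / until_day: datetime.date values, sent with day
    granularity (YYYY-MM-DD).
    bbox: optional (west, south, east, north) tuple; the 'bbox'
    argument is a hypothetical extension, not part of OAI-PMH.
    """
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_day is not None:
        params["from"] = from_day.isoformat()
    if until_day is not None:
        params["until"] = until_day.isoformat()
    if bbox is not None:
        west, south, east, north = bbox
        params["bbox"] = f"{west},{south},{east},{north}"
    return f"{base_url}?{urlencode(params)}"


url = list_records_url("http://example.org/oai", from_day=date(2007, 3, 1))
```

A provider that does not know the extra argument would, strictly following the spec, answer with a badArgument error, so a harvester should only send it to providers known to support it.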



See Also