Simple Catalog Interface

This page helps document a conversation in the Geodata Committee about developing simple protocols for discovering and syndicating metadata describing where geodata is available. The conversation is visible and joinable through the OSGeo geodata mailing list (archives for Aug. 2006).


 * See Also: Geodata Metadata Requirements

Stefan Keller wrote an excellent overview describing some of the background to the felt need for a simple metadata exchange/search interface.

Requirements

 * we want a simple web interface
 * we want it to be as existing-standards-compliant as possible
 * we want it to be as easy as possible to syndicate and to contribute to
 * we want to register not only static resources but also web services

Web addressable interfaces
http://www.gis.hsr.ch/wiki/OAI-PMH - Stefan Keller's comparison of OAI-PMH and WFS

Metadata models
see Geodata Metadata Requirements

Notes from Metadata + Catalog BOF at FOSS4G2006
stefan - ISO issues - need to describe services - being merged with ebRIM - compulsory core is 300 attributes - this is overkill for user+producer requirements


 * has a vision of how to overcome the problem

tom from owscat - hard time finding implementations of cat-2 that are available - simple requirement to discover service resources - is doing capabilities indexing and spitting out layer / index metadata - search clients are WFS clients

my spiel

schuyler - metacarta - interested in contributing solutions - + helping the geodata committee

adoyle - simple as possible, right as possible - search interfaces - human interfaces - how do text search indexes make it seem so easy - "how do we make it seem so hard?"

nedh - i'm not even sure what CSW is but i think that's what i'm interested in

Josh: CAT was its previous name - a semi-abstract spec - CSW was the description of the HTTP service. ebRIM - CSW - "the spec is big for a reason because it tries to address a lot of different communities" - corba binding, z39.50 binding.

OGC wants to contrib to simple discovery -

interested in simple profile development - work happening in geonetwork

Raj from OGC

trying to catalogue everything - data, data services, symbols, so much process. 2 parts of simple answer - 1/ do we only want to catalogue a class of thing? 2/ what comes out of the information model and letting the programmers develop

Rob Atkinson - can't build simple profiles - wants to catalogue all this kind of thing - too many implementations - little in the way of interop

"care about not the objects, but the relationships between them; this service implements this feature type". ability to *query* relationships - this is contained in the ebRIM metamodel

rob: "protocol is not the issue"

stef: "profile is not the issue"

getting the information model right

metadata slots bound to *vocabularies* that are domain defined - this sounds much more like OWL - "guide the user through the semantic space"

Jeroen - geonetwork opensource - macmini

tom from owscat - extending WFS - putting metadata model through it. wms 1.1.1 capabilities + wfs 1.0.0 metadata - everyone's waiting for geonetwork to implement this. wfs interesting engine for query. josh suggests small proxy to offer a csw interface over it - a geoserver module.

SLD to WMS - i care about the features that it support - this is what jody is saying - query via SLD so i know what to query for.

size of model. relationship issue. stef talks about OAI-PMH approach - jeroen says geonetwork will soon support an OAI-PMH interface

then we still need to define the minimal model

jeroen - in UNEP - work on defining a metadata subset - avoiding later refactoring of metadata - concern about making new versions of existing standards.

"metadata exchange protocol"

data for internal and external use

internal model - smallest common denominator

cascading query services - lightweight interface at the indexing level.

integrating search services. google talks dublin core, talks oai-pmh

rob: flexibility of the meta model - we can't predict what we're going to receive or need to express. - no guarantee of consistency -

how do we agree to agree?

stef - it's not easy to chain services - discovery first, chaining afterwards.

lightweight protocol which doesn't include filter.

OAI for propagation - is the data useful when you propagate it - our information model

Notes from Metadata + Catalog BOF at FOSS4G2006 (ff.)
Thank you, Jo, for the notes. I hope you are all safely back home from Switzerland! I uploaded my slides from FOSS4G2006, and I again took some time to define a minimal metadata information model for a metadata exchange protocol. --SFK 11:06, 19 September 2006 (CEST)

Metadata information model (proposal)
Some design considerations:
 * This is a minimal metadata information model for a metadata exchange protocol for harvesting (e.g. no filter or GML implementation needed).
 * Based on Dublin Core (DC) and Catalogue Services Specification 2.0.1, OGC 04-021r3, p.22.
 * Dublin Core needs refined semantics for some properties/attributes.
 * Had a hard time with the abundant use of namespaces. This is because DC specs and other XML 'practices' specialize property/attribute types instead of specializing whole classes.
 * All properties/attributes have cardinality 1 except where really needed for automation!
 * Take as much information as possible in an automated manner, e.g. from the dataset resource.
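Because the model targets harvesting only, a client needs little more than a way to ask a provider for the records changed since a given date. The following is a minimal sketch in Python of building such a request with the OAI-PMH `ListRecords` verb. The base URL is hypothetical, and `oai_dc` is the metadata prefix that every OAI-PMH repository has to support, not a prefix defined for this proposal.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is the provider's OAI-PMH endpoint (hypothetical here).
    from_date restricts the harvest to records changed on or after that date.
    """
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument: when it is
        # sent, no other argument besides the verb may accompany it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, used only for illustration.
url = list_records_url("http://example.org/oai", from_date="2006-09-01")
print(url)
```

Selection by date is the only form of "query" used here, which matches the design goal of not requiring a filter implementation on the provider side.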

Table:

Remarks:
 * DC attributes/properties left as they are...: Audience; Contributor; Creator.
 * All attributes/properties have cardinality 1 except dc:relation and dct:format.
 * No additional DC attributes/properties required; a few of them needed to be specialized (see dct:...); still some attributes/properties need some specialized recommended meaning (see tbd.).
 * dct:modified and dct:spatial can be sync'ed from dataset.
 * Attribute 'relation': This wasn't discussed yet. Simply helps harvesters to discover more (meta) data providers.
 * For some general explanations of dc/dct, see: http://cicharvest.grainger.uiuc.edu/qualifieddc.asp
 * Note that OAI-PMH puts an XML envelope around this metadata and adds a header containing two attributes: 'identifier' to identify a metadata record and 'datestamp' as the date of the last (published) change of the metadata record.
 * Assume metadata (as opposed to geodata) is always free and open information.
 * An encoding still has to be discussed (see following example). need schemaLocation in OSGeo!?
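Two of the remarks above lend themselves to a short illustration: the cardinality rule (only dc:relation and dct:format may repeat) and the OAI-PMH envelope with its 'identifier' and 'datestamp' header. The Python sketch below is an assumption-laden illustration, not a proposed implementation. The function names are hypothetical, the check only detects repeated properties (the full list of required properties is not given here), and the datestamp value is made up.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response schema.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"

# Per the remarks, only these two properties may occur more than once.
REPEATABLE = {"dc:relation", "dct:format"}


def repeated_properties(properties):
    """Return names that occur more than once although they are not repeatable.

    properties is a list of (name, value) pairs, e.g. ("dc:title", "...").
    """
    counts = Counter(name for name, _value in properties)
    return sorted(name for name, n in counts.items()
                  if n > 1 and name not in REPEATABLE)


def wrap_record(identifier, datestamp, metadata_element):
    """Wrap a metadata element in an OAI-PMH <record> with its <header>."""
    def oai(tag):
        return "{%s}%s" % (OAI_NS, tag)

    record = ET.Element(oai("record"))
    header = ET.SubElement(record, oai("header"))
    ET.SubElement(header, oai("identifier")).text = identifier
    ET.SubElement(header, oai("datestamp")).text = datestamp
    ET.SubElement(record, oai("metadata")).append(metadata_element)
    return record


# 'geometadc' is the envelope name used in the example on this page;
# the datestamp is an illustrative value only.
payload = ET.Element("geometadc")
record = wrap_record("f264-77d2-09ce-aa39-f0f0", "2006-09-19", payload)
print(ET.tostring(record, encoding="unicode"))
```

Keeping the header separate from the payload means a harvester can decide from 'datestamp' alone whether a record needs to be fetched again, without understanding the metadata model inside.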

Example
Notes:
 * Example values are purely fictitious.
 * XML Schema (= geometadc.xsd) still tbd.
 * This record is not yet validated!
 * Took 'geometadc' as envelope name.

 f264-77d2-09ce-aa39-f0f0 National Elevation Mapping Service for Texas Elevation data collected for the National Elevation Dataset (NED) based on 30m horizontal and 15m vertical accuracy.  Elevation, Hypsography, and Contours f264-77d2-09ce-aa39-f0f0 grid geodata uri:http://www.gis.hsr.ch/wms uri:http://www.gis.hsr.ch/data/poi_data_rapperswil.shp</dc:format>
 <dct:modified>2004-03-01</dct:modified>
 <dct:spatial>
   <Box projection="EPSG:4326" name="Geographic">34.353 -96.223 28.229 -108.44</Box>
 </dct:spatial>
 <dc:language>en</dc:language>
 <dc:source>lineage: ...</dc:source>
 <dc:rights>uri:http://www.usgs.gov/pubprod/</dc:rights>
 <dc:publisher>U.S. Geological Survey</dc:publisher>
 </dct:description>
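Since dct:spatial is meant to be synchronized automatically from the dataset, a harvester needs to read the `Box` element reliably. The Python sketch below parses the `Box` fragment from the example into its projection and four numbers. The function name is hypothetical, and because the order of the four coordinates is not stated on this page, the code deliberately returns them unlabeled.

```python
import xml.etree.ElementTree as ET

# The Box fragment as it appears in the example record.
FRAGMENT = ('<Box projection="EPSG:4326" name="Geographic">'
            ' 34.353 -96.223 28.229 -108.44 </Box>')


def parse_box(xml_text):
    """Return (projection, [four floats]) from a Box element.

    The meaning of each coordinate (which one is north, east, ...) is
    an open question here, so the values are returned in document order.
    """
    box = ET.fromstring(xml_text)
    values = [float(v) for v in (box.text or "").split()]
    if len(values) != 4:
        raise ValueError("expected 4 coordinates, got %d" % len(values))
    return box.get("projection"), values


projection, coords = parse_box(FRAGMENT)
print(projection, coords)
```

Splitting on arbitrary whitespace makes the parser tolerant of the irregular spacing seen in the example values.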