Difference between revisions of "Simple Catalog Interface"

From OSGeo

Revision as of 01:26, 19 September 2006

This page helps document a conversation in the Geodata Committee about developing simple protocols for discovering and syndicating metadata describing where geodata is available. The conversation is visible and joinable through the OSGeo geodata mailing list (archives for Aug. 2006).

Stefan Keller wrote an excellent overview describing some of the background to why people feel a need for a simple metadata exchange/search interface.

Requirements

  • we want a simple web interface
  • we want it to be as existing-standards-compliant as possible
  • we want it to be as easy to syndicate and contribute to as possible
  • we want to register not only static resources but also web services

Web addressable interfaces

http://www.gis.hsr.ch/wiki/OAI-PMH - Stefan Keller's comparison of OAI-PMH and WFS

Metadata models

see Geodata Metadata Requirements


Data exchange formats

Implementations

See Also


Notes from Metadata + Catalog BOF at FOSS4G2006

stefan - ISO issues - need to describe services - being merged with ebRIM - compulsory core is 300 attributes - this is overkill for user+producer requirements

  • has a vision of how to overcome the problem

tom from owscat - hard time finding implementations of cat-2 that are available - simple requirement to discover service resources - is doing capabilities indexing and spitting out layer / index metadata - search clients are WFS clients

my spiel

schuyler - metacarta - interested in contributing solutions - + helping the geodata committee

adoyle - simple as possible, right as possible - search interfaces - human interfaces - how do text search indexes make it seem so easy - "how do we make it seem so hard?"

nedh - i'm not even sure what CSW is but i think that's what i'm interested in

Josh: CAT was its previous name - a semi-abstract spec - CSW was the description of the HTTP service. ebRIM - CSW - "the spec is big for a reason because it tries to address a lot of different communities" - corba binding, z39.50 binding.

OGC wants to contrib to simple discovery -

interested in simple profile development - work happening in geonetwork

Raj from OGC

trying to catalogue everything - data, data services, symbols, so much process. 2 parts of simple answer - 1/ do we only want to catalogue a class of thing? 2/ what comes out of the information model and letting the programmers develop

Rob Atkinson - can't build simple profiles - wants to catalogue all this kind of thing - too many implementations - little in the way of interop

"care about not the objects, but the relationships between them; this service implements this feature type" . ability to *query* relationships - this is contained in the ebRIM metamodel

rob: "protocol is not the issue"

stef: "profile is not the issue"

getting the information model right

metadata slots bound to *vocabularies* that are domain defined - this sounds much more like OWL - "guide the user through the semantic space"

Jeroen - geonetwork opensource - macmini

tom from owscat - extending WFS - putting metadata model through it. wms 1.1.1 capabilities + wfs 1.0.0 metadata - everyone's waiting for geonetwork to implement this. wfs interesting engine for query. josh suggests small proxy to offer a csw interface over it - a geoserver module.

SLD to WMS - i care about the features that it support - this is what jody is saying - query via SLD so i know what to query for.

size of model. relationship issue. stef talks about OAI-PMH approach - jeroen says geonetwork will soon support an OAI-PMH interface

then we still need to define the minimal model

jeroen - in UNEP - work on defining a metadata subset - avoiding later refactoring of metadata - concern about making new versions of existing standards.

"metadata exchange protocol"

data for internal and external use

internal model - smallest common denominator

cascading query services - lightweight interface at the indexing level.

integrating search services. google talks dublin core, talks oai-pmh

rob: flexibility of the meta model - we can't predict what we're going to receive or need to express. - no guarantee of consistency -

how do we agree to agree?

stef - it's not easy to chain services - discovery first, chaining afterwards.

lightweight protocol which doesn't include filter.

OAI for propagation - is the data useful when you propagate it - our information mode

Notes from Metadata + Catalog BOF at FOSS4G2006 (ff.)

Thank you, Jo, for the notes. I hope you are all safely back home from Switzerland! I uploaded my slides at FOSS4G2006, and I again took some time to define a minimal metadata information model for a metadata exchange protocol. --SFK 11:06, 19 September 2006 (CEST)

Metadata information model (proposal)

Some design considerations:

  • Based on Dublin Core (DC) and Catalogue Services Specification 2.0.1, OGC 04-021r3, p.22.
  • Dublin Core needs refined semantics for some properties/attributes.
  • Have had a hard time with the abundant use of namespaces. This is because DC specs and other XML 'practices' specialize property/attribute types instead of specializing whole classes.
  • All properties/attributes have cardinality 1 except where really needed for automation!
  • Take all the information one can in an automated manner, e.g. from the data set resource.

Table:

| Attr. name | Cardinality | Attr. type | Explanation | Status |
|---|---|---|---|---|
| dc:identifier | [1] | string | Unique id to identify a resource (URI); see UUID but also OAI-PMH! | tbd. |
| dc:title | [1] | string | Title of the resource. | Ok |
| dc:description | [1] | string | A description of the resource (why dct:abstract?) | Ok |
| dc:subject | [1] | string | Could be ISO 19115 classification or keyw., comma separated?? | tbd. |
| dc:relation | [unbounded] | URI | Reference to other data providers or to ‘friends’ as indicated here. | tbd. |
| dct:type | [1] | string | Type of original resource, like vector, raster, grid geodata. | tbd. |
| dct:format | [unbounded] | URI | enum of ‘http, ftp, WMS, WFS’ (= Well known data access services), ‘Filter Service’ else ‘WSDL’. | tbd. |
| dct:modified | [1] | date | Date of last (published) change of resource. (Automated sync. from dataset) | Ok |
| dct:spatial | [1] | dcmiBox:Box | with CRS (Automated sync. from dataset) | Ok? |
| dc:language | [1] | enum | RFC 1766 (ISO 639, followed optionally by country ISO 3166) | Ok |
| dc:source | [1] | URL (preferred) or string | Lineage information about the resource | Ok? |
| dc:rights | [1] | URL (preferred) or string | License information about the resource | Ok? |
| dc:publisher | [1] | structure (refinement of string) | Civic Address or URI to point to (xAL/KML?) | tbd. |
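
To make the cardinalities in the table concrete, here is a minimal Python sketch of the proposed model as a record type. It is only an illustration under assumptions: the class names `GeoMetaRecord` and `BoundingBox`, the field names, and the simple north/south check are hypothetical and not part of the proposal. Single-valued attributes become plain fields; the two [unbounded] attributes become lists.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class BoundingBox:
    """Hypothetical stand-in for the dct:spatial box together with its CRS."""
    crs: str
    north: float
    east: float
    south: float
    west: float

    def __post_init__(self) -> None:
        # Only a basic sanity check; west > east is allowed on purpose,
        # since a box may cross the antimeridian.
        if self.south > self.north:
            raise ValueError("southlimit must not exceed northlimit")


@dataclass
class GeoMetaRecord:
    """One record of the proposed model (names are illustrative only)."""
    identifier: str          # dc:identifier [1]
    title: str               # dc:title [1]
    description: str         # dc:description [1]
    subject: str             # dc:subject [1]
    type: str                # dct:type [1]
    modified: date           # dct:modified [1]
    spatial: BoundingBox     # dct:spatial [1]
    language: str            # dc:language [1]
    source: str              # dc:source [1]
    rights: str              # dc:rights [1]
    publisher: str           # dc:publisher [1]
    relation: List[str] = field(default_factory=list)  # dc:relation [unbounded]
    format: List[str] = field(default_factory=list)    # dct:format [unbounded]


# Fictitious values, taken from the example record on this page.
record = GeoMetaRecord(
    identifier="f264-77d2-09ce-aa39-f0f0",
    title="National Elevation Mapping Service for Texas",
    description="Elevation data collected for the National Elevation Dataset (NED).",
    subject="Elevation, Hypsography, and Contours",
    type="grid geodata",
    modified=date(2004, 3, 1),
    spatial=BoundingBox("EPSG:4326", north=34.353, east=-96.223,
                        south=28.229, west=-108.44),
    language="en",
    source="lineage: ...",
    rights="http://www.usgs.gov/pubprod/",
    publisher="U.S. Geological Survey",
    relation=["f264-77d2-09ce-aa39-f0f0"],
    format=["http://www.gis.hsr.ch/wms",
            "http://www.gis.hsr.ch/data/poi_data_rapperswil.shp"],
)
```

Keeping everything except relation and format single-valued means a harvester never has to decide which of several titles or dates to index, which seems to be the intent behind the cardinality rule.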


Remarks:

  • DC attributes/properties left as they are...: Audience; Contributor; Creator.
  • All attributes/properties have cardinality 1 except dc:relation and dct:format.
  • No additional DC attributes/properties required; a few of them needed to be specialized (see dct:...); some attributes/properties still need a specialized recommended meaning (see tbd.).
  • dct:modified and dct:spatial can be sync'ed from dataset.
  • Attribute 'relation': This wasn't discussed yet. It simply helps harvesters to discover more (meta)data providers.
  • See for some general explanations about dc/dct: http://cicharvest.grainger.uiuc.edu/qualifieddc.asp
  • Note that OAI-PMH puts an XML envelope around this metadata and adds a header containing two attributes: 'identifier' to identify a metadata record and 'datestamp' as the date of the last (published) change of the metadata record.
  • Assume metadata (as opposed to geodata) is always free and open information.
  • An encoding still has to be discussed (see following example). Need schemaLocation in OSGeo!?
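
The remark above about the OAI-PMH envelope can be sketched as follows. This is a rough illustration, not a harvester implementation: it uses Python's standard `xml.etree.ElementTree`, the OAI-PMH 2.0 namespace, and the record/header/metadata layout of that protocol. The function name `wrap_in_oai_record`, the `oai:example.org:...` identifier, and the datestamp value are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"
GEOMETA_NS = "http://www.osgeo.org/schemas/geometa/"


def wrap_in_oai_record(metadata: ET.Element, identifier: str,
                       datestamp: str) -> ET.Element:
    """Wrap a metadata element in an OAI-PMH style <record> (hypothetical helper)."""
    record = ET.Element(f"{{{OAI_NS}}}record")
    header = ET.SubElement(record, f"{{{OAI_NS}}}header")
    # 'identifier' names the metadata record itself, not the described resource.
    ET.SubElement(header, f"{{{OAI_NS}}}identifier").text = identifier
    # 'datestamp' is the last change of the metadata record.
    ET.SubElement(header, f"{{{OAI_NS}}}datestamp").text = datestamp
    ET.SubElement(record, f"{{{OAI_NS}}}metadata").append(metadata)
    return record


# A deliberately tiny payload; a real record would carry all attributes.
payload = ET.Element(f"{{{GEOMETA_NS}}}qualifieddc")
ET.SubElement(payload, f"{{{DC_NS}}}title").text = (
    "National Elevation Mapping Service for Texas")

oai_record = wrap_in_oai_record(
    payload,
    identifier="oai:example.org:f264-77d2-09ce-aa39-f0f0",  # made-up value
    datestamp="2006-09-19",                                  # made-up value
)
print(ET.tostring(oai_record, encoding="unicode"))
```

Note that the header datestamp and dct:modified are separate dates: the first tracks the metadata record, the second the resource it describes.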

Example

Notes:

  • Example values are purely fictitious.
  • XML Schema (= geometadc.xsd) still tbd.
  • This record is not yet validated!
  • Took 'geometadc' as envelope name.
 <geometadc:qualifieddc 
   xmlns:geometadc="http://www.osgeo.org/schemas/geometa/" 
   xmlns:dc="http://purl.org/dc/elements/1.1/" 
   xmlns:dct="http://purl.org/dc/terms/" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation="http://www.osgeo.org/schemas/geometa/ geometadc.xsd">
   <dc:identifier>f264-77d2-09ce-aa39-f0f0</dc:identifier>
   <dc:title>National Elevation Mapping Service for Texas</dc:title>
   <dc:description>Elevation data collected for the National Elevation
 Dataset (NED) based on 30m horizontal and 15m vertical accuracy.
   </dc:description>
   <dc:subject>Elevation, Hypsography, and Contours</dc:subject>
   <dc:relation>f264-77d2-09ce-aa39-f0f0</dc:relation>
   <dct:type>grid geodata</dct:type>
   <dct:format>uri:http://www.gis.hsr.ch/wms</dct:format>
   <dct:format>uri:http://www.gis.hsr.ch/data/poi_data_rapperswil.shp</dct:format>
   <dct:modified>2004-03-01</dct:modified>
   <dct:spatial>
     <Box projection="EPSG:4326" name="Geographic">
       <northlimit>34.353</northlimit>
       <eastlimit>-96.223</eastlimit>
       <southlimit>28.229</southlimit>
       <westlimit>-108.44</westlimit>
     </Box>
   </dct:spatial>
   <dc:language>en</dc:language>
   <dc:source>lineage: ...</dc:source>
   <dc:rights>uri:http://www.usgs.gov/pubprod/</dc:rights>
   <dc:publisher>U.S. Geological Survey</dc:publisher>
 </geometadc:qualifieddc>
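
As a rough illustration of how a harvester might read such a record, the following Python sketch parses a shortened, well-formed copy of the example (closing tag matching the root element) with the standard `xml.etree.ElementTree` module. The function name `read_record` and the shape of the returned dictionary are assumptions for this example only; since the schema is still tbd., the sketch also assumes the Box element carries no namespace, as in the example.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used for the lookups below.
NS = {
    "geometadc": "http://www.osgeo.org/schemas/geometa/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
}

# Shortened copy of the example record (fictitious values).
RECORD_XML = """<geometadc:qualifieddc
  xmlns:geometadc="http://www.osgeo.org/schemas/geometa/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:dct="http://purl.org/dc/terms/">
  <dc:identifier>f264-77d2-09ce-aa39-f0f0</dc:identifier>
  <dc:title>National Elevation Mapping Service for Texas</dc:title>
  <dct:modified>2004-03-01</dct:modified>
  <dct:spatial>
    <Box projection="EPSG:4326" name="Geographic">
      <northlimit>34.353</northlimit>
      <eastlimit>-96.223</eastlimit>
      <southlimit>28.229</southlimit>
      <westlimit>-108.44</westlimit>
    </Box>
  </dct:spatial>
</geometadc:qualifieddc>"""


def read_record(xml_text: str) -> dict:
    """Extract a few fields from a record (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Box is un-namespaced in the example, so it is matched without a prefix.
    box = root.find("dct:spatial/Box", NS)
    limits = ("westlimit", "southlimit", "eastlimit", "northlimit")
    return {
        "identifier": root.findtext("dc:identifier", namespaces=NS),
        "title": root.findtext("dc:title", namespaces=NS),
        "modified": root.findtext("dct:modified", namespaces=NS),
        "crs": box.get("projection"),
        # Ordered as (west, south, east, north), i.e. (minx, miny, maxx, maxy).
        "bbox": tuple(float(box.findtext(name)) for name in limits),
    }


info = read_record(RECORD_XML)
print(info["title"], info["crs"], info["bbox"])
```

Returning the box as (west, south, east, north) is just one convenient choice for feeding a spatial index; the proposal itself does not prescribe an ordering.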