Difference between revisions of "Talk:DCLite4G"

From OSGeo
(moving contents of the original into the talk page)
 
 
OAI-PMH - Stefan, why don't we want to require GetRecord?
:Just thought, ListRecords (incl. temporal query parameters) would be enough for harvesting. But I don't mind. -- [[User:Sfkeller|Stefan]] 09:30, 15 March 2007 (CET) 
  
 
== Interfaces ==

My current thinking, after talking to pramsey about [[Simple Catalog Interface]]s, is to separate out interfaces for different classes of data search and retrieval tasks. According to this narrative, data, services (including presentation) and "relationships" are three different classes of thing that CSW/ebRIM is trying to treat all at once, which is why it is so slow and overcomplex.

=== Service metadata ===

Many people are more interested in repositories of information about web services. They want real-time "find-bind" access.

=== Package metadata ===

Right now this is the domain of things like shapefiles, GML files, etc. In theory it could be any resource that does not change between visits. This also applies to local filesystems and data management.

=== Publishing metadata ===
  
 
Other people are more interested in the publish side - real data syndication between repositories - the kind of thing that OAI-PRE will one day support. In the meantime we have to make something out of what we have got. Jeroen writes,
 
{{{
For the harvesting of the catalog itself, GeoNetwork has a custom
interface/ process. This process will go out to some XML service
(provided by another GeoNetwork node at this stage) and will request
a very brief result set from the other catalog that contains file
identifier (UUID), time stamp and catalog ID.  By comparing those
with its internally cached records it will decide to request new or
updated records and it will remove those not found anymore.
}}}
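The comparison step Jeroen describes could be sketched roughly as follows. This is not GeoNetwork's code: the <code>Summary</code> class and <code>plan_harvest</code> function are hypothetical names, the catalog ID is left out on the assumption that a single remote catalog is harvested, and time stamps are assumed to be ISO 8601 strings in one consistent format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Brief record summary: file identifier plus last-modified time stamp."""
    uuid: str
    timestamp: str  # ISO 8601 text; compares correctly as a string when formats match


def plan_harvest(remote, cached):
    """Compare remote summaries with the locally cached ones.

    Returns (to_fetch, to_remove): UUIDs that are new or updated remotely,
    and UUIDs that are cached locally but no longer offered remotely.
    """
    remote_by_id = {s.uuid: s for s in remote}
    cached_by_id = {s.uuid: s for s in cached}

    to_fetch = sorted(
        uuid for uuid, s in remote_by_id.items()
        if uuid not in cached_by_id or s.timestamp > cached_by_id[uuid].timestamp
    )
    to_remove = sorted(set(cached_by_id) - set(remote_by_id))
    return to_fetch, to_remove


# Made-up example data: "a" is new, "b" was updated, "c" is unchanged, "d" vanished.
remote = [
    Summary("a", "2007-03-15T09:30:00Z"),
    Summary("b", "2007-03-10T00:00:00Z"),
    Summary("c", "2007-03-01T00:00:00Z"),
]
cached = [
    Summary("b", "2007-03-01T00:00:00Z"),
    Summary("c", "2007-03-01T00:00:00Z"),
    Summary("d", "2007-02-01T00:00:00Z"),
]
to_fetch, to_remove = plan_harvest(remote, cached)
print(to_fetch, to_remove)
```

Only the brief summaries are compared; the full records would be requested afterwards for the UUIDs in <code>to_fetch</code>.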
 
== Implementations ==

=== Implementations of OAI-PMH ===

Each entry gives the name of the implementation, whether it is software or a web application (or an instance of both), the programming language, and an example web link. Some use oai_dc 'DClite4G style'.

Harvesters:
* Google (Lang. n/a; Ex. spec. http://www.google.com/sitemaps )
* OAI Home - Tools (Lang. many; Home http://www.openarchives.org/tools/tools.html )
* geometa.info (Lang. Java; Ex. http://geometa.info )

Metadata Providers:
* GeoNetwork? (Provider; Webapplication/Open source; Lang. Java+XSLT; Ex. tbd.)
* GeoShop.com (Lang. C; Ex. http://urlx.org/ibbinfoshop.ch/1d987 )
* Geometa-Ed. HSR (Lang. PHP; Ex. http://urlx.org/geometa.info/0363d )
* Geometa-Ed. Chur (Lang. PHP; Ex. http://urlx.org/geometa.info/67333 )
* List of OAI tools: http://www.openarchives.org/tools/tools.html
* List of OAI data providers: http://www.openarchives.org/

Candidate Projects (harvesters and metadata providers):
* uDig (Provider+Harvester; Desktop Software/Open source; Lang. Java; Ex. tbd.)
* OSGEO (Provider; Instance/Open Source; Lang. C+Scripts; Ex. tbd.)
* Canton of Aargau? (Provider; Instance/Closed source; Lang. .NET; Ex. tbd.)
 
'''DCLite4G''' is an effort to
* establish a common information model with mappings to well-known formats for geospatial metadata
* provide a vocabulary/recommendations for spatial extensions to Dublin Core
* support simple interfaces for collecting and querying geospatial metadata
 
== Getting involved ==
To get involved, first subscribe to the mailing list (see below). Then read the corresponding pages of this wiki. If you find that something is missing or incorrect, correct or add it as required. To do this you need to create an account for this wiki. Sorry, still working on single sign-on...

Once you get the hang of it and want to be more involved, you can join any interest group or project and contribute to the process. This can happen from many perspectives: as a user, portal operator, developer, decision maker, you name it.

=== Mailing List ===
Feel welcome to join the discussion and development [http://lists.eogeo.org/mailman/listinfo/dclite4g mailing list]:
* http://lists.eogeo.org/mailman/listinfo/dclite4g
 
== TODO ==

* extract more of the specifics from [[Geodata Metadata Requirements]]. (This has changed a lot between the original version in the history and the current version, which differentiates data sets from data sources; the latter may be files, databases or services.)
* update docs on [[Geodata Metadata Model]]
* draw some UML to appease people?
 
= Information Model =

== Core Model ==

[[Image:Metadata.png]]

=== Dublin Core ===
[http://dublincore.org/ The Dublin Core Metadata Initiative] is an open organization engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. An introduction to the topic is available in the [http://dublincore.org/documents/usageguide/ Usage Guide].
 
=== ISO19115 ===

=== FGDC ===

== Namespace ==

The '''dclite4g''' namespace provides a common naming scheme for properties specific to geographic data sets and services that are not covered by the existing Dublin Core or GeoRSS standards.

The namespace will live at http://xmlns.com/2006/dclite4g/ .

* extents
* accuracy
* scale

''' fill in here from [[Geodata Metadata Requirements]] '''
 
== References ==

* [http://geometa.info/rappiinfo/wiki/index.php/DClite4G Dublin Core lite for Geo]
* [http://portal.opengeospatial.org/files/?artifact_id=5929&version=2 OGC Catalog Services 2 Specification], '''6.3.3''', ''Core returnable properties''
* [http://portal.opengeospatial.org/files/?artifact_id=12604&version=1&format=pdf OGC ebRIM profile of CSW specification], '''Appendix B.5, Table B.3''', ''Slots defined in the Basic package''
* [http://udig.refractions.net/docs/api-udig/net.refractions.udig.catalog/net/refractions/udig/catalog/IGeoResourceInfo.html IGeoResourceInfo class in uDig]
* [http://www.geodatacommons.umaine.edu/wpapers/CGD%20Metadata%20White%20Paper%20v2.pdf GeodataCommons Metadata Whitepaper]
 
= Query Interface =

== OAI-PMH ==

[http://www.openarchives.org/OAI/openarchivesprotocol.html The Harvesting Protocol (version 2.0) specification] together with [http://www.openarchives.org/OAI/2.0/guidelines.htm Implementation Guidelines]

The following are specific guidelines for a minimal OAI-PMH implementation of a so-called 'data provider' using only the mandatory 'unqualified' Dublin Core (DC):

* Only '''three''' operations (verbs) are needed: Identify, ListMetadataFormats and ListRecords.
* The following operations are not required (initially): ListIdentifiers, ListSets, GetRecord.
* No incremental harvesting (the resumptionToken process for ListXxx operations with more than 1000 records)
* No compression as defined in the OAI-PMH spec (compression at the lower HTTP level is still possible)
* Date granularity may be 'day' rather than seconds (YYYY-MM-DD)
* Tracking of deleted records need not be supported (deletedRecord=no)
* Mandatory DC is sufficient as a data model for a start, but with specific semantics (e.g. coverage, relation) (see also example below):
** dc:description contains dct:abstract
** dc:coverage contains the bounding box encoding as defined in http://georss.org/simple.html#Box
** dc:date in fact means dct:modified
** dc:relation is filled in with dclite4g:onLineSrc. If dc:type='service', dct:hasPart can be derived from GetCapabilities.
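As an illustration of the DC conventions listed above, here is a minimal sketch that builds one oai_dc record with Python's standard library. The <code>build_record</code> function, the sample values and the example.org URL are made up for illustration. It also assumes that dc:coverage carries the GeoRSS Simple box as plain text ("south west north east"), which is one possible reading of the guideline.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"

# Use the conventional prefixes when serializing.
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)


def build_record(title, abstract, modified, bbox, online_src):
    """Build an oai_dc element following the conventions listed above.

    bbox is (south, west, north, east) in decimal degrees.
    """
    root = ET.Element(f"{{{OAI_DC}}}dc")

    def add(name, text):
        ET.SubElement(root, f"{{{DC}}}{name}").text = text

    add("title", title)
    add("description", abstract)  # carries what dct:abstract would hold
    add("date", modified)         # last-modified date, day granularity
    add("coverage", " ".join(str(v) for v in bbox))  # GeoRSS Simple box order
    add("relation", online_src)   # online source of the data set or service
    return root


# Hypothetical sample values.
record = build_record(
    title="Example dataset",
    abstract="Hypothetical abstract text.",
    modified="2007-03-15",
    bbox=(45.8, 5.9, 47.8, 10.5),
    online_src="http://example.org/data/example.zip",
)
print(ET.tostring(record, encoding="unicode"))
```

A harvester that knows these conventions can read the abstract, bounding box and modification date back out of plain unqualified DC without needing a separate metadata format.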
 
Additional bounding box query property for ListRecords.
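How such a bounding box property might look is not specified here, so the following is only a sketch under stated assumptions. The <code>bbox</code> request argument and the helper names are hypothetical; <code>bbox</code> is not part of OAI-PMH 2.0, and a strict repository would likely reject an unknown argument. The box is assumed to be given as south, west, north, east, matching the GeoRSS Simple order used in dc:coverage.

```python
from urllib.parse import urlencode


def list_records_url(base_url, from_date=None, until_date=None, bbox=None):
    """Build a ListRecords request URL for oai_dc records.

    from_date / until_date use day granularity (YYYY-MM-DD).
    bbox is (south, west, north, east); the 'bbox' argument is a
    hypothetical extension, not part of OAI-PMH 2.0.
    """
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    if bbox:
        params["bbox"] = ",".join(str(v) for v in bbox)
    return f"{base_url}?{urlencode(params)}"


def parse_box(text):
    """Parse a GeoRSS Simple box string: 'south west north east'."""
    south, west, north, east = (float(v) for v in text.split())
    return south, west, north, east


def intersects(a, b):
    """True if two (south, west, north, east) boxes overlap or touch.

    Boxes that cross the antimeridian are not handled.
    """
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Client side: request records changed since a date within a query box.
url = list_records_url(
    "http://example.org/oai",  # hypothetical repository base URL
    from_date="2007-03-01",
    bbox=(45.8, 5.9, 47.8, 10.5),
)
print(url)

# Provider side: keep a record only if its dc:coverage box meets the query box.
query_box = (45.8, 5.9, 47.8, 10.5)
print(intersects(parse_box("47.0 8.0 47.5 9.0"), query_box))
```

On the provider side, the same <code>intersects</code> test would be applied to each record's dc:coverage value before it is added to the ListRecords response.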
 
== Others ==

== References ==

* [http://devgeo.cciw.ca/owscat/docs/index.html OWSCat]
* [[Simple Catalog Interface]] - links to articles, etc.

= See Also =

* http://www.geometa.info/ - German geospatial data search service using dclite4g + OAI-PMH.

Latest revision as of 04:39, 29 May 2007
