Response to INSPIRE Metadata Draft


 * See Reading the INSPIRE Metadata Draft for preparatory material.
 * See FOSS SDIC

Any member of the Free and Open Source Geospatial software community is welcome to participate in creating this response. Please add your name to the list of participants, and put contact details on your userpage, so that you can be reached at any key stage (first stable draft; when sending the response).

Due 15.03.2007? -- Stefan 19:11, 15 March 2007 (CET)

I am sure it was the 30th, the NAP draft was due the 15th?

Right now I am trying to get the documents describing the form of the response out of the JRC, and to change our contact details for the SDIC. -- Jo 23-03

The deadline is definitely the 30th March, thankfully. Comments must be submitted in this Excel template: http://www.ec-gis.org/inspire/sdic_call/templates/metadataIRComments.xls

The form for each comment is:

Chapter, section or clause no. | Paragraph/Figure/Table | Type of comment | Comment (justification for change) | Proposed change

So I am working on integrating Ian's and Jan's comments and reconnecting mine to chapter, verse and clause. I planned to put it all into the Excel template this evening, but then discovered that the template doesn't work properly in OpenOffice. So I'm just going to dump a pipe-delimited version of what is collated so far into this wiki page for others to see, and worry about getting it into the proper format later in the week :/ I realise it makes sense to do that anyway, because there is a lot of structuring to do to get as much as possible directly connected to the draft... What is there now is only a start. :/

-- Jo 26-03
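
For moving the pipe-delimited rows on this page into a spreadsheet later, something like the following might do. It is only a sketch: the field names are our own labels for the five columns of the comment form, not anything defined by the JRC template, and it assumes that no cell contains a literal `||`.

```python
import csv
import io

# Labels for the five columns of the comment form. The identifiers are
# our own choice (an assumption), not names taken from the JRC template.
FIELDS = ["clause", "paragraph", "comment_type", "comment", "proposed_change"]


def parse_row(line):
    """Split one ' * a || b || c || d || e' wiki row into a dict."""
    text = line.strip()
    if text.startswith("*"):
        text = text[1:]  # drop the leading list bullet
    cells = [cell.strip() for cell in text.split("||")]
    if len(cells) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} cells, got {len(cells)}")
    return dict(zip(FIELDS, cells))


def rows_to_csv(lines):
    """Render the parsed rows as CSV text that a spreadsheet can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for line in lines:
        if line.strip():  # skip blank lines between rows
            writer.writerow(parse_row(line))
    return buffer.getvalue()
```

The CSV text can then be opened in OpenOffice or Excel and pasted into the template by hand; the sketch does not try to write the .xls file itself.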

Just to mention that I am doing another batch this evening and will upload it either later tonight or first thing in the morning. Meanwhile the JRC has replaced the spreadsheet template with one that works, which is great.

-- Jo 28-03

= Participants =


 * Jo Walsh
 * Markus Neteler
 * Ian Ibbotson
 * Jan Růžička
 * Stefan Keller

= Responses =

Note: these will be automatically re-sorted by chapter, clause etc order when they are submitted so there's no priority ordering and not much point in number ordering...
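
If anyone wants to preview the rows in clause order before submission, a sort key along these lines may be enough. This is a guess at a sensible ordering, not the JRC's actual procedure: it assumes numbered clauses sort numerically (so 5.2 comes before 5.10) and that annex letters come after all numbered clauses.

```python
def clause_sort_key(clause):
    """Build a sortable key from a clause label such as '5.3.2' or 'A'."""
    parts = []
    for part in clause.strip().split("."):
        if part.isdigit():
            # Numeric components compare as integers, so 2 < 10.
            parts.append((0, int(part), ""))
        else:
            # Assumption: annex letters sort after every numbered clause.
            parts.append((1, 0, part))
    return parts


ordered = sorted(["A", "5.3.2", "5.2", "5.10"], key=clause_sort_key)
# ordered == ["5.2", "5.3.2", "5.10", "A"]
```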

General responses that are harder to "fit" but should be offered.
{|
 * 5.2 || || T || Section 5.2 "Discovery metadata elements" starts to set out a list of concepts that the document hints at, but does not directly say, are core to the discovery process. Section 5.3 then sets out the "Abstract discovery metadata element set". I *guess* the implication is that the concepts laid out in 5.2 are in some way even more abstract than those set out in 5.3. The document really isn't clear about what the abstract model is, or what it is for, before it starts enumerating the concepts. Your later comment about being tied to web services is also spot on here: I'm really not sure "Service type version", "Operation name" and "Distributed computing platform" belong in an abstract discovery model (they probably *do* belong in some result record schema). These three attributes seem to belong specifically to a particular service binding (or, as already said, to a very specific kind of returned result record). What I'd really like to see is a much clearer statement of what the purpose of the abstract discovery model is. Hopefully, once that is tightly defined, it should become easier to decide what lies inside the boundary of the abstract model, and what belongs in the domain of specific realisations of the abstract model. In the information retrieval community generally, it's considered really important to have a separate abstract model for discovery (the search access points) and then bind that model on to as many backend schemas as needed. This decoupling is seen as best practice in the information retrieval domain, and it is the approach taken in the [http://www.blueangeltech.com/standards/GeoProfile/geo22.htm Z39.50 GEO profile]. Most of my concerns here arise from the apparent 1:1 mapping between the abstract model and the implementation. || Add an explanation, or a reference to coverage in the discovery services draft, of the abstract discovery model.
 * 5.3.2 || || T || 5.2.2 talks about what I would expect to see from a temporal reference, but 5.3.2 maps temporal reference on to "One of the dates of publication, last revision or creation of the resource". These three elements are already well defined by Dublin Core attributes... Maybe I've misunderstood what is implied by table 1 in 5.3.2. Also, similar issues to the spatial access point arise (with structured data, as opposed to text queries). In some UK datasets, periods such as "Neolithic" can be used instead of an ISO 19108 Date Time. (I see that note 11 under 5.3.4 talks about this, which is good. What's important is that, regardless of the outcome of the study, the IR are extensible enough to cope with the eventual decision.) I'd consider separate access points for controlled-vocabulary time periods and structured temporal data. This seems a specific example where the abstract IR model needs to go beyond what is defined in the A2 binding. || What is indicated by a Temporal Reference should be consistent between 5.2.2 and 5.3.2, or clearer examples offered in the Annexes.
 * A || || T || What are the expected semantics of resource language on retrieval of language-neutral data sets? Should a result record not be selected because the user specified "Nor" as the search language, even though it matches the other criteria (geographic extent, for example)? Normally in information retrieval the answer is "no", but this is less clear when result records aren't primarily text based. (Actually, this is a slightly wider concern about Annex A and those "CharacterString" elements... In IEEE LOM, for example, we have a "LangString" element that has a "Lang" attribute. That community chose to allow language variants of a resource to be expressed within one record by allowing an element to hold all language variants. The presence of a "Lang" attribute at the "Dataset" level might mean the intention is to support multi-language datasets by having several dataset records, one for each language, which is OK, but possibly not optimal for datasets that aren't primarily language based. If this is the case, is the "CharacterString" element in Annex A just redundant payload?) || Add a treatment in Annex I that describes the specifics of search across multi-language elements; reconsider the expression of multilanguage CharacterStrings in the light of information retrieval best practice (see comment on this change)
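
To make the resource-language question in the Annex A comment concrete, here is a small sketch of one possible reading, in which language-neutral records are never filtered out by a language criterion and text elements carry all their language variants in one record, in the spirit of LOM's LangString. The `Record` type, its field names and the "eng" fallback are all hypothetical; nothing here is taken from the draft or from the Annex A schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Record:
    # Hypothetical result record, not the Annex A schema.
    identifier: str
    # Title variants keyed by language code, in the spirit of a LangString.
    titles: Dict[str, str] = field(default_factory=dict)
    # None marks a language-neutral resource, e.g. a raster data set.
    resource_language: Optional[str] = None


def matches_language(record, wanted):
    """Keep language-neutral records; otherwise require an exact match."""
    return record.resource_language is None or record.resource_language == wanted


def display_title(record, wanted, fallback="eng"):
    """Pick the title variant for the requested language, if there is one."""
    if wanted in record.titles:
        return record.titles[wanted]
    # Assumption: fall back to one agreed language, else an empty string.
    return record.titles.get(fallback, "")
```

With these semantics a search for "nor" would return Norwegian-language resources and any language-neutral ones, and the language only decides which title variant is displayed. Whether that is the behaviour the IR intend is exactly what the comment asks the drafting team to state.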