INSPIRE Community Forum

Issues with INSPIRE-theme classification in GeoNetwork

Dear All,

Our SDI catalogue (https://ide.cat/en/catalogue/) is based on GeoNetwork version 3.0.6.

After some months operating this catalogue, we discovered some strange results in the INSPIRE theme classification that the catalogue application automatically performs after indexing the catalogue content.

In particular, these results appear on two pages of the catalogue user interface: "Get Started" (https://ide.cat/geonetwork/srv/eng/catalog.search#/home) and "Search" (https://ide.cat/geonetwork/srv/eng/catalog.search#/search), specifically in the facets that allow filtering by "INSPIRE Themes".

The issue is that some data sets get classified under a specific INSPIRE theme although their metadata does not contain the corresponding keyword from the "GEMET - INSPIRE themes" thesaurus.

EXAMPLE: "WMS Nivological and avalanche information" metadata:

Metadata simplified view: 

https://ide.cat/geonetwork/srv/eng/catalog.search#/metadata/5160f3ff-7d34-4711-871b-dfdaca35202a

XML view: 

https://ide.cat/geonetwork/srv/eng/xml.metadata.get?id=500703

The full XML may be accessed here (attached in 'Group files' in this platform):

https://inspire.ec.europa.eu/forum/file/view/263423/xml-file-to-show-incorrect-inspire-theme-classification-in-geonetwork

Although this metadata has no keywords from the "GEMET - INSPIRE themes" thesaurus, it gets classified (and indexed) under the "Geology" INSPIRE theme, probably because GeoNetwork finds a plain-text keyword matching the name of this theme.

In our view this behaviour is not appropriate: a metadata file may contain keywords that match (or partially match) an INSPIRE theme name, for example because the data set relates to that theme in specific use cases, even though the data set actually belongs to another INSPIRE theme.
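To make the difference concrete, here is a minimal Python sketch (not GeoNetwork's actual indexing code, which is XSLT) that contrasts a loose match on keyword text with a strict match that also requires the keyword block to cite the "GEMET - INSPIRE themes" thesaurus. It assumes ISO 19139 records with keywords encoded as gco:CharacterString; the names THEME_NAMES, SAMPLE, loose_themes and strict_themes are hypothetical, the theme list is deliberately shortened, and the sample record is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 namespaces.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

INSPIRE_THESAURUS = "GEMET - INSPIRE themes"

# Hypothetical, shortened list of INSPIRE theme names (lower case).
THEME_NAMES = {"geology", "hydrography", "natural risk zones"}

# Invented sample record: "Geology" is a free keyword without a thesaurus,
# "Natural risk zones" comes from the INSPIRE themes thesaurus.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo><gmd:MD_DataIdentification>
    <gmd:descriptiveKeywords><gmd:MD_Keywords>
      <gmd:keyword><gco:CharacterString>Geology</gco:CharacterString></gmd:keyword>
    </gmd:MD_Keywords></gmd:descriptiveKeywords>
    <gmd:descriptiveKeywords><gmd:MD_Keywords>
      <gmd:keyword><gco:CharacterString>Natural risk zones</gco:CharacterString></gmd:keyword>
      <gmd:thesaurusName><gmd:CI_Citation><gmd:title>
        <gco:CharacterString>GEMET - INSPIRE themes, version 1.0</gco:CharacterString>
      </gmd:title></gmd:CI_Citation></gmd:thesaurusName>
    </gmd:MD_Keywords></gmd:descriptiveKeywords>
  </gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""


def keywords_by_thesaurus(xml_text):
    """Return (keyword, thesaurus title or None) pairs found in the record."""
    root = ET.fromstring(xml_text)
    pairs = []
    for block in root.iter("{%s}MD_Keywords" % NS["gmd"]):
        title_el = block.find(
            "gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString", NS
        )
        has_title = title_el is not None and title_el.text
        title = title_el.text.strip() if has_title else None
        # Keywords encoded as gmx:Anchor are ignored in this sketch.
        for kw in block.findall("gmd:keyword/gco:CharacterString", NS):
            if kw.text and kw.text.strip():
                pairs.append((kw.text.strip(), title))
    return pairs


def loose_themes(xml_text):
    """Any keyword whose text equals a theme name, whatever its thesaurus."""
    return {kw for kw, _ in keywords_by_thesaurus(xml_text)
            if kw.lower() in THEME_NAMES}


def strict_themes(xml_text):
    """Only keywords whose block cites the INSPIRE themes thesaurus."""
    return {kw for kw, title in keywords_by_thesaurus(xml_text)
            if title and title.startswith(INSPIRE_THESAURUS)
            and kw.lower() in THEME_NAMES}


if __name__ == "__main__":
    # Themes assigned only because of a free-text keyword.
    print(loose_themes(SAMPLE) - strict_themes(SAMPLE))
```

Running the difference between the two sets over an export of the catalogue should list, per record, the themes that were assigned only because of a free-text keyword.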

Any views on this? Similar experiences? How to resolve it?

Thanks in advance!

Jordi

  • Paul van Genuchten

    Hi Jordi, I duplicated your issue to https://github.com/geonetwork/core-geonetwork/issues/4182 so that other developers are also aware of it. The behaviour you mention has developed over time. In previous iterations the INSPIRE keyword didn't always have a thesaurus, which is probably why any keyword is indexed as a potential INSPIRE theme. Also consider the generic discovery case: if you're looking for geology data, non-INSPIRE data sets could also be relevant to you. In that case the INSPIRE theme is used more as a categorisation.

    If you want to make the discovery of data sets by theme stricter, you can alter how keywords are indexed in https://github.com/geonetwork/core-geonetwork/blob/fd44c1fa14d818e6272e97b90f085fa98370fbe1/schemas/iso19139/src/main/plugin/iso19139/index-fields/default.xsl#L318. From that code, my impression is that the situation has already improved in more recent versions of GeoNetwork.

    Hope this helps to find a solution.

     

     

  • Jordi ESCRIU

    Hi Paul,

    Many thanks for your response.

    We would like to apply (in the short term) some improvements / restrictions on the fields that our SDI catalogue is indexing, so we will probably check the code you are pointing me to. This helps a lot.

    Regarding the generic discovery case, I can see your point. However, my view is that INSPIRE theme classification often leads users of the catalogue to think about "INSPIRE Conformant data sets", which most of the time is not the case.

    This brings a related aspect into the discussion: the topicCategory classification from ISO 19115, which should satisfy these generic classification needs for the discovery case, is usually not clear enough for classifying geospatial data sets.

    What do you think?

    Jordi