INSPIRE Community Forum

gml:id, gml:identifier and the InspireID – Clarifications and Best Practices

Many of us who create INSPIRE GML hit some common challenges around identifying features. In part, these come from the technical requirements of XML/GML, and in part they come from INSPIRE requirements.

An INSPIRE feature will generally have three properties that identify objects, each with a different purpose. In our work with more than 50 clients, we've used different approaches to creating gml:ids, InspireIDs and gml:identifiers. In this post, we explain the background of the three identifying properties and present our current best practices.

https://www.wetransform.to/news/2018/02/12/best-practices-for-inspire-ids/

I would be very interested to see what you do, and what you can recommend!

Thorsten

  • Peter PARSLOW

    Thorsten,

    that's a helpful set of notes. It's pretty similar to what we do at OS. In particular, the UK chose not to implement a national ID register, and advises data publishers to use Internet domain names as identifiers (thus delegating the management of identifier namespaces to the existing web domain name governance):

    1. In the general case, where a single 'internal data management' feature instance leads to a single INSPIRE feature instance (spatial object):

    1.1 the internal identifier becomes the gml:id. Some of our internal identifiers are purely numeric, in which case we prefix them to fit in with the XML id rules, with a short string that (to us) identifies the governance of the ID, e.g. "osgb" or "usrn" where the internal identifier is actually the national Unique Street Reference Number, so doesn't "belong" to OS.

    1.2 the internal identifier becomes the inspireId.localId

    1.3 we set the inspireId.namespace to something within our control. We've not been too consistent on this: some get "http://data.ordnancesurvey.co.uk/id"; some get "http://data.ordnancesurvey.co.uk"; some get "http://data.os.uk/"

    1.4 the gml:identifier is set to the concatenation of the namespace & localId. Note: so far, the resulting HTTP URI is only dereferenceable in the case of OS Open Names (GN dataset) & "Boundary-Line" (AU dataset).

    2. where we split a single internal feature instance into two (or more) INSPIRE spatial objects, we create an "internal identifier" on the fly - a UUID. So far, we only do this for NamedPlaces that represent long roads, because that better supports the gazetteer-search use case (e.g. "A35 near Ringwood").

    2.1 for the gml:id, we prefix this with "ID_" to fit xml:id rules

    Otherwise, it's treated as above.

    3. So far, where we could be thought of as combining (merging, joining) multiple internal objects, we consider the resulting thing a new object.
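
    The rules in points 1 and 2 above can be sketched in a few lines of Python. This is only an illustration, not OS's actual implementation: the function names, the dictionary keys, the example ID value and the choice to join namespace and localId with a single "/" are all assumptions made for the example.

```python
import re
import uuid

# ASCII subset of the XML NCName rule that xml:id / gml:id values must follow:
# start with a letter or underscore, then letters, digits, '.', '-' or '_'.
_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def make_identifiers(local_id, namespace, prefix, always_prefix=False):
    """Derive gml:id, inspireId parts and gml:identifier from one internal id.

    Hypothetical helper: the prefix is only added when the internal id is not
    already a valid XML id (e.g. purely numeric), unless always_prefix is set.
    """
    needs_prefix = always_prefix or _NCNAME.fullmatch(local_id) is None
    gml_id = prefix + local_id if needs_prefix else local_id
    if _NCNAME.fullmatch(gml_id) is None:
        raise ValueError(f"cannot build a valid gml:id from {local_id!r}")
    # Assumption: namespace and localId are joined with exactly one '/'.
    gml_identifier = namespace.rstrip("/") + "/" + local_id
    return {
        "gml_id": gml_id,
        "local_id": local_id,
        "namespace": namespace,
        "gml_identifier": gml_identifier,
    }


def make_split_identifiers(namespace):
    """Point 2: a split-off spatial object gets a fresh UUID as its internal id."""
    return make_identifiers(str(uuid.uuid4()), namespace, "ID_", always_prefix=True)


# Made-up numeric id, prefixed so that it becomes a valid gml:id.
numeric = make_identifiers("12345678", "http://data.os.uk/", "usrn")
print(numeric["gml_id"])          # usrn12345678
print(numeric["gml_identifier"])  # http://data.os.uk/12345678
```

    Keeping the unprefixed value as the localId means the gml:id prefix stays a purely XML-level workaround and does not leak into the persistent identifier.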

    Fortunately, we have a long history of maintaining persistent identifiers for our legacy core data products (TOIDs in OS MasterMap), including quite some understanding of the difficulties this raises for data producers.

    Thanks again for the notes.

    Peter

  • Stefania MORRONE

    Hi Thorsten,

    the creation of PIDs (Persistent Identifiers) for the spatial objects is indeed one of the major challenges in the implementation of the INSPIRE Directive.

    You may find it useful to have a look at the experiences from Spain (creation of a National PID management system) and Romania (adoption of standardized PIDs and short URLs) presented at the INSPIRE Conference in Strasbourg. Links to the presentations can be found in this TC post:

    https://themes.jrc.ec.europa.eu/discussion/view/152623/pid-management-a-key-topic-to-leveraging-of-inspire-data

    Stefania

  • Thorsten REITZ

    Stefania & Peter, thank you very much for your notes and comments!

    Thorsten

  • Dominique LAURENT

    First, congratulations on this summary of good practices on identifiers.

    At IGN France, we have also had a lot of discussions about identifiers when elaborating our matching tables.

    Our main choices:

    - we don't yet use URIs

    - we use semantic namespaces

    - if the same source feature is used to derive several INSPIRE features (e.g. the same watercourse used for HY PhysicalWaters and HY HydroNetwork), we ensure uniqueness by (systematically) adding the name of the INSPIRE application schema to the namespace

    - merging features: we haven't used the identifiers of the merged features, but have tried to find alternative solutions. For instance, we had to merge upper-level AUs from municipalities, and we could use the thematic identifier of the upper-level AU. For InlandWaterway, we merged the WatercourseLinks belonging to the same Watercourse and having the same "gabarit" (or CEMT class), so id(InlandWaterway) = id(Watercourse) + gabarit value

    - splitting features: we have tried to avoid it. We had some discussion for CP, as GM_Surface is the recommended option for INSPIRE, but had we split our GM_MultiSurface features, we would have been unable to ensure the persistence of the identifiers of the split features.

    We also had a lot of issues when transforming an attribute in the source data into a feature type in INSPIRE (for themes AD and TN). For the TN properties, we used the identifier of the geometric object (RoadLink, RailwayLink, ...) with the property name as prefix, as we created a property feature type for each geometric object. For AD, it was more complex and we had to find some thematic identifier for the AddressComponent.
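
    The derivation rules described above could look roughly like the following Python sketch. It is not IGN's actual code: the function names, the "." and "_" separators, and all example values (such as the "FR.EXAMPLE" namespace) are hypothetical and only serve to show how each rule keeps the resulting identifiers unique.

```python
def schema_namespace(base_namespace, application_schema):
    """Add the INSPIRE application schema name to the namespace, so that one
    source feature can yield several INSPIRE features with distinct ids."""
    # Assumption: a '.' separator; the real convention is not given in the post.
    return f"{base_namespace}.{application_schema}"


def merged_waterway_id(watercourse_id, gabarit):
    """Merged feature: id(InlandWaterway) = id(Watercourse) + gabarit value."""
    # Assumption: an '_' separator between the two parts.
    return f"{watercourse_id}_{gabarit}"


def property_feature_id(property_name, link_id):
    """TN property feature: the link identifier with the property name as prefix."""
    return f"{property_name}_{link_id}"


# The same (made-up) watercourse id used in two application schemas.
source_id = "COURS_D_EAU_0001"
physical = (schema_namespace("FR.EXAMPLE", "PhysicalWaters"), source_id)
network = (schema_namespace("FR.EXAMPLE", "HydroNetwork"), source_id)
print(physical != network)  # True: same localId, different namespaces

print(merged_waterway_id(source_id, "IV"))
print(property_feature_id("RoadWidth", "ROUTE_0042"))
```

    Because every derived identifier is a pure function of stable source values, re-running the transformation yields the same identifiers, which is the property that splitting features would have broken.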