INSPIRE Community Forum

Correcting percentageUnderDesignation bug in PS Simple xml schema?

Dear PS experts,

when working on the update of the Annex I xml schemas (MIWP-18a), where the MS agreed we should publish backwards-incompatible new versions of all the schemas, I wondered whether we should use the opportunity to fix the bug with the percentageUnderDesignation in the current schema.

In the IR, the attribute percentageUnderDesignation has the type Percentage. However, this type is never defined in the IR.

In the schema, the type of percentageUnderDesignation is currently defined as anyURI.

In the data specification and the UML model repository, the Percentage type is defined as a sub-type of Integer, with the definition "A percentage value, being an integer between 0 and 100 inclusive."

If we decide to follow the data specification and UML data model, the type of percentageUnderDesignation should be xs:integer or something like this (more closely following the definition in the TG):

<xs:simpleType name="Percentage">
   <xs:restriction base="xs:integer">
     <xs:minInclusive value="0"/>
     <xs:maxInclusive value="100"/>
   </xs:restriction>
</xs:simpleType>

But I don't know how people currently use the schema (I imagine it is quite difficult to express a percentage using a URI) and what effects such a change would have.

If you agree to this proposal, I would put the publication of the new PS Simple xml schema on hold until we have reached an agreement on the best way forward here. It should then be discussed in the MIWP-14 sub-group and MIG-T for endorsement.

 

Cheers,

Michael

  • Michael LUTZ

    Reply from Darja Lihteneger:

    Dear Michael, all,

    Thank you for this proposal. I think it is important to update the INSPIRE PS (Simple application schema) with the correct Percentage type.

    I briefly checked how percentage is used in diverse INSPIRE data specifications. In several cases the type Integer is used for attributes that hold percentage values (see attached file). This is the case in Land Cover, Land Use and Buildings. Soil uses real numbers. Protected Sites uses Percentage.

    Brian, could you help with some feedback on how percentages are used in different biodiversity databases?

    The CDDA doesn't include any percentage values because, by default, the sites selected for the purpose of CDDA are 100% covered by the designations.

    The Natura2000 revised Standard Data Form includes information that should be expressed as percentages, e.g.: percentage of marine areas in the site and the percentage of the site within the bio-geographical region, etc.

    http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32011D0484&from=EN

    http://dd.eionet.europa.eu/schemas/natura2000/sdf_v1.xsd

    In the XSD, the percentage elements are defined with the decimal type, for example:

    <xs:element type="xs:decimal" name="marineAreaPercentage"/>
    <xs:element type="xs:decimal" name="coveragePercentage"/>
    <xs:element type="xs:decimal" name="percentage" minOccurs="0"/>

    For INSPIRE PS, there might be 3 options:

    1. Ideally, it would be great to have one definition and type of percentage used in all INSPIRE specifications. This will be hard to achieve after all existing adoptions of IR and TG. But, the closest and the simplest option would be to keep Percentage as Integer type. Additional examples or explanations could be added to the technical guidelines that values 0 and 100 are included.

    2. Your proposal to limit the Percentage type with minimum and maximum Integer values (0 and 100) would be the second option.

    3. The third option is Kathi's proposal: Percentage should formally be a measure type from ISO 19103 and include a unit of measure (UoM), here the percent sign, as follows:

    <ps:percentageUnderDesignation uom="%">100</ps:percentageUnderDesignation>

    From my point of view, the simplest is the first option. The definition of the Percentage is not in the IR and it could be changed in TG.
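
    As a rough sketch only, the three options would differ in the element declaration as shown below. The element name comes from the existing schema; the ps: prefix on the Percentage type and the use of gml:MeasureType (GML 3.2) for option 3 are assumptions, and occurrence constraints are omitted. The declarations are alternatives, so only one of them would appear in the schema.

    ```xml
    <!-- Option 1: plain integer, with the 0..100 range explained in the TG only -->
    <xs:element name="percentageUnderDesignation" type="xs:integer"/>

    <!-- Option 2: integer restricted to 0..100 via the Percentage simpleType proposed above -->
    <xs:element name="percentageUnderDesignation" type="ps:Percentage"/>

    <!-- Option 3: measure with a unit of measure (assumes gml:MeasureType, which carries a uom attribute) -->
    <xs:element name="percentageUnderDesignation" type="gml:MeasureType"/>
    ```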

    Kind regards, Darja

  • Michael LUTZ

    Reply from Brian MacSharry:

    Hi all

    Thanks again for the email. I agree that the percentage issue should be resolved and that it seems to be an error that it was not defined correctly in the first place.

    Here are some “real world” examples where percentages are used: in most if not all cases the percentage values are real numbers, e.g. 25.6%, rather than integers.

    Protected Area:

    -In the CDDA there is a field called "Marine_area_perc" in the "sites" table. This field records the percentage of the protected area that is marine e.g. 25.6% of site X is Marine.

    -In the Natura 2000 Standard Data Form

    Field 2.3 asks for the % of the site that is marine (2.3 Marine area (%)).

    Field 2.6 asks for the percentage of the site's geography that is within the biogeographical regions or marine regions.

    Field 4.1 asks for the percentage cover of various habitat types within a site, with the total = 100%.

    Field 4.4 (an optional field) asks for the percentage of the site that is within different types of ownership.

    Field 5.1 (an optional field) asks for the percentage of the site covered by existing national designations.

    Field 5.2 (an optional field) asks for the percentage of the site covered by specific sites at the national and international level. [International is used here to mean those sites that are designated under various conventions or similar acts countries have signed up to, e.g. World Heritage Sites or Ramsar sites.] This is useful information, as you often have one geography being covered by several legal designations.

    Habitats/Species:

    In the current Article 17 reporting being coordinated by the ETC/BD, several percentage values are used, for example:

    Species

    Field 2.3.5 Short term trend magnitude a)minimum: this field asks for the percentage change over a period,

    Field 2.3.8 Long term trend magnitude a) minimum : this field asks for the percentage change over a period,

    Field 2.4.8 Short term magnitude a) minimum : this field asks for the percentage change over a period,

    Field 2.4.8 Short term magnitude c) confidence interval

    Field 2.4.12 Long term trend magnitude a) minimum : this field asks for the percentage change over a period,

    Field 2.4.12 Long term trend magnitude c) confidence interval

    Habitats

    Field 2.3.5 Short term trend magnitude a)minimum: this field asks for the percentage change over a period,

    Field 2.3.8 Long term trend magnitude a) minimum : this field asks for the percentage change over a period

    Field 2.4.6 Short term magnitude a) minimum : this field asks for the percentage change over a period,

    Field 2.4.6 Short term magnitude c) confidence interval

    Field 2.4.10 Long term trend magnitude a) minimum : this field asks for the percentage change over a period,

    Field 2.4.10 Long term trend magnitude c) confidence interval

    I hope this shows that within the existing biodiversity data sets percentages are used as real numbers. It may be feasible to change the existing data standards to stipulate that where percentages are used they are integers, though that would involve discussion at the EEA and DG Env level.

    All the best, Brian

  • Michael LUTZ

    Brian, Darja, all,

    thanks for the quick feedback. I originally proposed to use (a constrained form of) xs:integer, because this is what we have in the data specification right now.

    But since we are not actually bound by the IR at this point (because there is no definition for the Percentage type), and following Brian's argument to allow real numbers, we could also go for that (note that in the XML schema Real is mapped to xs:double). However, this would ultimately mean a change in the UML data model and the data specification.

    Or are there any good reasons for staying with integers as currently proposed in the DS?

    And do we need to hard-code in the schema (regardless of whether we choose integers or doubles) that values may only range from 0 to 100?
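
    To make the question concrete, a range-restricted variant based on xs:double could look like the sketch below. It reuses the Percentage type name from the original proposal and is an illustration only, not an agreed definition. Dropping the two facets would leave a plain xs:double with no range check in the schema.

    ```xml
    <xs:simpleType name="Percentage">
      <xs:restriction base="xs:double">
        <!-- Hard-coded range: remove these two facets to allow any value -->
        <xs:minInclusive value="0"/>
        <xs:maxInclusive value="100"/>
      </xs:restriction>
    </xs:simpleType>
    ```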

    Best regards, Michael

  • Darja LIHTENEGER

    Hello,
    Looking into some definitions of percentage, we might use the mathematical definition from Wikipedia, http://en.wikipedia.org/wiki/Percentage : “A percentage is a number or ratio expressed as a fraction of 100 … While percentage values are often between 0 and 100 there is no restriction and one may, for example, refer to 111% or −35%.”

    I would propose to keep the Percentage type open to different uses in practice: all values should be allowed (not limited to the range 0..100), and the value should be provided as a decimal. This would also match current practice in the reporting obligations (see above).

    In addition to the Percentage type definition, it would be good to add a recommendation and examples on how to provide the percentage values for the attribute “percentageUnderDesignation” in the INSPIRE Protected Sites. In this case, the percentage will stay in the range 0-100 (the percentage of the site that falls under the designation), and this could become a requirement for the INSPIRE Protected Sites. Adding an example of the GML encoding could be useful, too.
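
    A minimal sketch of what such an example might look like, assuming the element is declared as an unrestricted xs:decimal and the ps prefix is bound to the Protected Sites Simple namespace. The surrounding elements are omitted, and the value 25.6 is purely illustrative.

    ```xml
    <!-- Schema side (assumed): decimal, no range facets -->
    <xs:element name="percentageUnderDesignation" type="xs:decimal"/>

    <!-- Instance side: 25.6% of the site falls under the designation -->
    <ps:percentageUnderDesignation>25.6</ps:percentageUnderDesignation>
    ```

    With this declaration, the 0-100 range for Protected Sites would be stated as a requirement in the TG and not enforced by the schema.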
    What about other opinions?

    Kind regards,
    Darja

  • Stefania MORRONE

    Dear all,

    since the data type of the attribute percentageUnderDesignation has been changed to 'decimal'

    • in the corrigendum document for the PS Data Specification - Technical Guidelines 3.2
    • in the Protected Sites xml schema v4.0 

    I am closing this discussion.

    All the best

    Stefania

     

This discussion is closed.

Biodiversity & Area Management
