|START Conference Manager|
Geographic Web resources that are not part of any Spatial Data Infrastructure (SDI) can be an important source of information, or at least an additional one. On the one hand, there is information generated by experts in the field of geographic information but not incorporated into an SDI; on the other hand, there is geographic information generated by Web user communities, known as "Volunteered Geographic Information" (VGI). Recent research highlights the importance of using VGI as a source of information for SDIs. Regardless of their source, geographic Web resources that lie outside an SDI could enrich it, but incorporating them requires a great effort to create metadata.
The aim of this paper is to present work on the automatic characterization of Web resources through an extensible architecture. The system consumes a document describing a Web resource, automatically detects the model of the input metadata, discovers the type and format of the described resource, extracts the necessary information (metadata structures), and generates enhanced metadata according to the input model. The architecture allows different logic to be applied for extracting metadata structures and for improving the existing description (from simple mapping to analysis of the related resources). A prototype has been dedicated to generating OGC CSW records that describe geoportal Web resources. The main problem was extracting geographic extent information (coverage), because the Web pages do not contain geographic metadata. For this reason, the system has been extended with Natural Language Processing algorithms that extract place names, which are then translated into the geographic extent they represent.
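The last step, turning place names found in a page into a geographic extent, can be illustrated with a minimal sketch. It is not the prototype's implementation: a plain dictionary lookup stands in for the Natural Language Processing analysis, and the `BBox` class, the `GAZETTEER` table (with rough, illustrative coordinates), and the function names are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BBox:
    """Bounding box in decimal degrees (west, south, east, north)."""
    west: float
    south: float
    east: float
    north: float

    def union(self, other: "BBox") -> "BBox":
        # Smallest box that contains both boxes.
        return BBox(
            min(self.west, other.west),
            min(self.south, other.south),
            max(self.east, other.east),
            max(self.north, other.north),
        )


# Hypothetical toy gazetteer; the coordinates are rough illustrative values.
GAZETTEER: Dict[str, BBox] = {
    "spain": BBox(-9.5, 36.0, 3.5, 43.8),
    "portugal": BBox(-9.6, 36.9, -6.2, 42.2),
}


def find_place_names(text: str, gazetteer: Dict[str, BBox] = GAZETTEER) -> List[str]:
    """Return gazetteer entries mentioned in the text, in order of first mention.

    A simple token lookup stands in for real place-name recognition here.
    """
    found: List[str] = []
    for token in re.findall(r"[^\W\d_]+", text.lower()):
        if token in gazetteer and token not in found:
            found.append(token)
    return found


def estimate_extent(text: str, gazetteer: Dict[str, BBox] = GAZETTEER) -> Optional[BBox]:
    """Merge the boxes of all recognized place names into one extent."""
    names = find_place_names(text, gazetteer)
    if not names:
        return None  # no coverage can be derived from this text
    extent = gazetteer[names[0]]
    for name in names[1:]:
        extent = extent.union(gazetteer[name])
    return extent
```

Calling `estimate_extent` on the text of a geoportal page would yield a single bounding box that could populate the coverage element of the generated record, or `None` when no known place name is mentioned. A real system would need disambiguation (several places can share one name) and multi-word name handling, which this sketch deliberately omits.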
Submission Type: Oral Presentation proposal
Submission Category: New policies, new requirements, new stakeholders