Utilizing Geoparsing for Mapping Natural Hazards in Europe.

Detailed bibliography
Title: Utilizing Geoparsing for Mapping Natural Hazards in Europe.
Authors: Yu, Tinglei; Zhang, Xuezhen; Yin, Jun
Source: Water (20734441); Dec2025, Vol. 17 Issue 24, p3520, 17p
Subjects: HAZARDS, NATURAL language processing, STATISTICS, LOCATION data, ACQUISITION of data, DISASTERS, GEOSPATIAL data
Geographic term: EUROPE
Abstract: Natural hazards exert a detrimental influence on human survival, environmental conditions and society. Historical hazard events have generated a broad corpus of literature addressing the spatiotemporal extent, dissemination or social responses. With regard to quantitative analysis based on information locked within verbose text, the release of such information from the narrative format is encouraging. Natural Language Processing (NLP), a technique demonstrated to be capable of automated data extraction, provides a useful tool in establishing a structured dataset on hazard occurrences. In our study, we utilize scattered textual records of historical natural hazard events to create a novel dataset and explore the applicability of NLP in parallel. We put forward a standard list of toponyms based on manual annotation of a compilation of disaster-related texts, all of which were references in an authoritative publication in the field. The final natural hazards dataset comprised location data, which referred to a specific hazard report in Europe during 1301–1500, together with its geocoding result, year of occurrence and detailed event(s). We evaluated the performance of four pre-trained geoparsing tools (Flair, Stanford CoreNLP, spaCy and Irchel Geoparser) for automated toponym extraction in comparison with the standard list. All four tested methods showed a high precision (above 0.99). Flair had the best overall performance (F1 score 0.89), followed by Stanford CoreNLP (F1 score 0.83) and Irchel Geoparser (F1 score 0.82), while spaCy had a poor recall (0.5). Then we divided natural hazards into six categories: extreme heat, snow and ice, wind and hails, rainstorms and floods, droughts, and earthquakes. Finally, we compared our newly digitized natural hazard dataset to a geocoded version of the dataset provided by Harvard University, thus providing a comprehensive overview of the spatial–temporal characteristics of European hazard observations.
The statistical outcomes of the present investigation demonstrate the efficacy of NLP techniques in text information extraction and hazard dataset generation, offering references for collaborative and interdisciplinary efforts. [ABSTRACT FROM AUTHOR]
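Note on the reported metrics: the abstract compares each tool's extracted toponyms with a manually annotated standard list and reports precision, recall and F1. The following is a minimal sketch of how such scores can be computed, not the authors' evaluation code. It assumes exact, set-based matching of place names; the function name `prf1` and the toy place names are hypothetical, and the paper may have matched at the level of individual mentions instead.

```python
def prf1(predicted, gold):
    """Set-based precision, recall and F1 for extracted toponyms.

    Assumption: a predicted toponym counts as correct only if it
    matches an entry of the gold (standard) list exactly.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical toy data, not taken from the paper's dataset.
gold = ["Prague", "Vienna", "Florence", "Cologne"]
predicted = ["Prague", "Vienna"]  # nothing wrong, but half is missed

precision, recall, f1 = prf1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this toy case every extracted name is correct but half of the gold list is missed. Because F1 is the harmonic mean of precision and recall, the low recall pulls the score down even though precision is perfect, which is the pattern the abstract describes for spaCy.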
Copyright of Water (20734441) is the property of MDPI and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Biomedical Index