Towards robust validation strategies for EO flood maps

Bibliographic Details
Published in:Remote sensing of environment Vol. 315; p. 114439
Main Authors: Landwehr, Tim, Dasgupta, Antara, Waske, Björn
Format: Journal Article
Language:English
Published: Elsevier Inc 15.12.2024
ISSN:0034-4257
Description
Summary:Flood maps based on Earth Observation (EO) data inform critical decision-making in almost every stage of the disaster management cycle, directly impacting the ability of affected individuals and governments to receive aid as well as informing policies on future adaptation. However, flood map validation is complicated by the class imbalance between flood and non-flood classes, an issue that has rarely been investigated. There are currently no established best practices for addressing this issue, and accuracy assessment of these maps is often treated as a mere formality, which undermines user trust in flood map products and limits their operational use and uptake. This paper provides the first comprehensive assessment of the impact of current EO-based flood map validation practices. Using flood inundation maps derived from Sentinel-1 synthetic aperture radar data with synthetically generated controlled errors, and Copernicus Emergency Management Service flood maps as the ground truth, binary metrics for quantifying flood detection accuracy were statistically evaluated for events under varying flood conditions. Class-specific metrics in particular were found to be sensitive to class imbalance: larger flood magnitudes yield higher metric scores, so these metrics are naturally biased towards overpredicting classifiers. Metric stability across error percentiles and flood magnitudes was assessed through the standard deviation calculated by bootstrapping, to quantify the impact of sample selection subjectivity; stratified sampling schemes consistently exhibited the lowest standard deviation. Thoughtful sample and response designs were critical, with probability-based random sampling and proportional or equal class allocation vital to producing robust accuracy estimates comparable across study sites, error classes, and flood magnitudes.
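The bootstrapped standard deviation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic reference map, the 5% flooded fraction, the 10% label-flip error, the equal allocation of 250 points per class, and all function names are assumptions chosen for illustration. Strata are defined here by the reference class, which is also an assumption.

```python
import numpy as np

def overall_accuracy(truth, pred):
    """Fraction of validation points where map and reference agree."""
    return float(np.mean(truth == pred))

def stratified_sample(truth, n_per_class, rng):
    """Draw n_per_class random indices from each class (equal allocation)."""
    idx = []
    for cls in (0, 1):
        members = np.flatnonzero(truth == cls)
        idx.append(rng.choice(members, size=n_per_class, replace=False))
    return np.concatenate(idx)

def bootstrap_std(truth, pred, metric, n_boot, rng):
    """Standard deviation of a metric over bootstrap resamples of the sample."""
    n = truth.size
    scores = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.integers(0, n, size=n)  # draw with replacement
        scores[b] = metric(truth[resample], pred[resample])
    return float(scores.std(ddof=1))

# Synthetic stand-in for a reference map and a flood map with controlled errors.
rng = np.random.default_rng(0)
truth = (rng.random(100_000) < 0.05).astype(int)  # assumed: ~5% flooded
flip = rng.random(truth.size) < 0.10              # assumed: 10% label errors
pred = np.where(flip, 1 - truth, truth)

sample = stratified_sample(truth, n_per_class=250, rng=rng)
sd = bootstrap_std(truth[sample], pred[sample], overall_accuracy,
                   n_boot=500, rng=rng)
print(f"bootstrap std of overall accuracy: {sd:.4f}")
```

Swapping `stratified_sample` for a different sampling scheme and comparing the resulting `sd` values is one way to explore how sample design affects the stability of an accuracy estimate.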
Results suggest that popular evaluation metrics such as the F1-Score are in fact unsuitable for accurate characterization of map quality and are not comparable across different study sites or events. Overall accuracy and the Matthews correlation coefficient (MCC) are shown to be the most robust performance metrics when sampling designs are optimized, and bootstrapping is demonstrated to be a necessary tool for estimating the variability in map accuracy that arises from the spatial sampling of validation points. Results presented herein pave the way for the development of global flood map validation guidelines, to support wider use of and trust in EO-derived flood risk and recovery products, eventually allowing us to unlock the full potential of EO for improved flood resilience.
Highlights:
• Validation designs & metrics are compared for binary flood map accuracy assessment.
• Class-specific metrics favor overpredicting classifiers despite optimal sampling.
• Use of bootstrapping as a tool to measure robustness of accuracy estimates.
• Most binary pattern matching metrics sensitive to flooded area proportion.
• Multiple complementary metrics and adjusted designs needed for spatial comparability.
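The sensitivity of binary metrics to the flooded area proportion can be made concrete with a small worked example. The sketch below uses the standard confusion-matrix definitions of overall accuracy, F1-Score, and MCC; the fixed per-class detection rates of 0.9 and the flooded fractions are hypothetical values, not figures from the paper, and the helper names are illustrative.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Overall accuracy, F1-Score, and MCC from confusion-matrix entries."""
    total = tp + fp + fn + tn
    oa = (tp + tn) / total
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den > 0 else 0.0
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den > 0 else 0.0
    return {"OA": oa, "F1": f1, "MCC": mcc}

def confusion_from_rates(flood_fraction, tpr, tnr, n=1_000_000):
    """Expected confusion matrix for a given flooded fraction and class rates."""
    flooded = flood_fraction * n
    dry = n - flooded
    tp = tpr * flooded   # flooded pixels mapped as flooded
    fn = flooded - tp    # flooded pixels missed
    tn = tnr * dry       # dry pixels mapped as dry
    fp = dry - tn        # dry pixels mapped as flooded
    return tp, fp, fn, tn

# Same hypothetical classifier (90% correct in each class), varying flood extent.
results = {
    p: binary_metrics(*confusion_from_rates(p, tpr=0.9, tnr=0.9))
    for p in (0.05, 0.2, 0.5)
}
for p, m in results.items():
    print(f"flooded fraction {p:.2f}: "
          f"OA={m['OA']:.3f} F1={m['F1']:.3f} MCC={m['MCC']:.3f}")
```

With identical per-class error rates, the F1-Score in this example rises as the flooded fraction grows, which is why scores from events of different magnitude are hard to compare directly.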
DOI:10.1016/j.rse.2024.114439